org.alfresco.repo.content.transform
Class TikaPoweredContainerExtractor

java.lang.Object
  extended by org.alfresco.repo.content.transform.TikaPoweredContainerExtractor

public class TikaPoweredContainerExtractor
extends java.lang.Object

Warning: this is a prototype service and is likely to change dramatically in Alfresco 4.0. It provides a way to have Apache Tika extract certain kinds of embedded resources from within a container file. For example, it could be used to extract all the images in a zip file, or to fetch all the Word documents embedded in an Excel spreadsheet. It uses the Apache Tika ContainerExtractor framework together with the Apache Tika Auto-Parser. This class is not wired into the Spring context by default; to use it, you must declare it manually as a bean in an extension context file.


Nested Class Summary
static class TikaPoweredContainerExtractor.ExtractorActionExecutor
          This action executor allows you to trigger extraction as an action, perhaps from a rule.
 
Constructor Summary
TikaPoweredContainerExtractor()
           
 
Method Summary
 java.util.List extract(org.alfresco.service.cmr.repository.NodeRef source, java.util.List mimetypes)
          Extracts all the entries from the container that match the supplied list of mime types.
 void setContentService(ContentService contentService)
          Injects the contentService bean.
 void setNodeService(org.alfresco.service.cmr.repository.NodeService nodeService)
          Injects the nodeService bean.
 void setTikaConfig(org.apache.tika.config.TikaConfig tikaConfig)
          Injects the TikaConfig to use.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TikaPoweredContainerExtractor

public TikaPoweredContainerExtractor()
Method Detail

setNodeService

public void setNodeService(org.alfresco.service.cmr.repository.NodeService nodeService)
Injects the nodeService bean.

Parameters:
nodeService - the nodeService.

setContentService

public void setContentService(ContentService contentService)
Injects the contentService bean.

Parameters:
contentService - the contentService.

setTikaConfig

public void setTikaConfig(org.apache.tika.config.TikaConfig tikaConfig)
Injects the TikaConfig to use.

Parameters:
tikaConfig - the TikaConfig to use.

extract

public java.util.List extract(org.alfresco.service.cmr.repository.NodeRef source,
                              java.util.List mimetypes)
Extracts all the entries from the container that match the supplied list of mime types. If no mime types are specified, all available embedded resources are extracted.
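
The matching rule described above can be illustrated with a small, self-contained sketch. It is not the Alfresco implementation and does not call Tika: MimetypeFilterSketch, select and the sample entries are hypothetical names introduced only for illustration, and treating both a null and an empty list as "no mime types specified" is an assumption that the documentation above does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration; not part of Alfresco or Apache Tika. */
class MimetypeFilterSketch {

    /**
     * Returns the names of the embedded entries whose mime type appears in the
     * supplied list. A null or empty list is treated as "no filter", so every
     * entry is returned (assumption, see the note above).
     *
     * @param entries   entry name mapped to its detected mime type
     * @param mimetypes mime types to keep, or null/empty to keep everything
     */
    static List<String> select(Map<String, String> entries, List<String> mimetypes) {
        boolean matchAll = (mimetypes == null || mimetypes.isEmpty());
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            if (matchAll || mimetypes.contains(entry.getValue())) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up contents of a container file, in insertion order.
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("logo.png", "image/png");
        entries.put("photo.jpg", "image/jpeg");
        entries.put("notes.doc", "application/msword");

        // Only the image entries match the supplied list.
        System.out.println(select(entries, Arrays.asList("image/png", "image/jpeg")));
        // No mime types specified: every entry is selected.
        System.out.println(select(entries, null));
    }
}
```

In the real service the equivalent call would be extract(source, mimetypes) on a configured TikaPoweredContainerExtractor, with the NodeRef of the container as the source; the contents of the returned list are not documented here, so the sketch returns plain entry names instead.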



Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.