org.alfresco.repo.content.transform
Class ArchiveContentTransformer

java.lang.Object
  extended by org.alfresco.repo.content.transform.ContentTransformerHelper
      extended by org.alfresco.repo.content.transform.AbstractContentTransformer2
          extended by org.alfresco.repo.content.transform.TikaPoweredContentTransformer
              extended by org.alfresco.repo.content.transform.ArchiveContentTransformer
All Implemented Interfaces:
ContentWorker, ContentTransformer

public class ArchiveContentTransformer
extends TikaPoweredContentTransformer

This class transforms archive files (zip, tar, etc.) to text, which enables indexing and searching of archives as well as web previewing. The transformation can either list only the names of the entries within the archive, or also include the textual content of the entries themselves. The former is suggested for web preview, the latter for indexing. This behaviour is controlled by the recurse flag.
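
The two output modes described above can be illustrated with a minimal, self-contained sketch. It is not the Alfresco or Tika implementation: it uses only java.util.zip, handles zip archives only, and assumes every entry is UTF-8 text. The names ArchiveTextSketch, toText and sampleZip are hypothetical, and the includeContents parameter merely plays the role of the recurse flag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; not part of the Alfresco API.
final class ArchiveTextSketch {

    /**
     * Renders a zip stream as plain text.
     * includeContents == false: one line per entry name (listing mode).
     * includeContents == true:  each entry name followed by the entry's text.
     */
    static String toText(InputStream archive, boolean includeContents) throws IOException {
        StringBuilder out = new StringBuilder();
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                out.append(entry.getName()).append('\n');
                if (includeContents && !entry.isDirectory()) {
                    // read() returns -1 at the end of the current entry
                    ByteArrayOutputStream body = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int n;
                    while ((n = zip.read(buffer)) != -1) {
                        body.write(buffer, 0, n);
                    }
                    // Assumption: entry bytes are UTF-8 encoded text
                    out.append(new String(body.toByteArray(), StandardCharsets.UTF_8)).append('\n');
                }
            }
        }
        return out.toString();
    }

    /** Builds a small in-memory zip with two text entries, for demonstration. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("alpha".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("beta".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip();
        System.out.println("Names only:");
        System.out.print(toText(new ByteArrayInputStream(zip), false));
        System.out.println("With contents:");
        System.out.print(toText(new ByteArrayInputStream(zip), true));
    }
}
```

Listing mode keeps the output short, which suits a preview; including entry text makes the words inside the archived files available to a full-text index, at the cost of reading every entry.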

Since:
3.4

Field Summary
static java.util.ArrayList SUPPORTED_MIMETYPES
          We support all the archive mimetypes that the Tika package parser can handle
 
Fields inherited from class org.alfresco.repo.content.transform.TikaPoweredContentTransformer
LINE_BREAK, sourceMimeTypes, WRONG_FORMAT_MESSAGE_ID
 
Constructor Summary
ArchiveContentTransformer()
           
 
Method Summary
protected  org.apache.tika.parser.ParseContext buildParseContext(org.apache.tika.metadata.Metadata metadata, java.lang.String targetMimeType, TransformationOptions options)
          By default returns a ParseContext that does not recurse
protected  org.apache.tika.parser.Parser getParser()
          Returns the correct Tika Parser to process the document.
 void setIncludeContents(java.lang.String includeContents)
           
 void setTikaConfig(org.apache.tika.config.TikaConfig tikaConfig)
          Injects the TikaConfig to use
 
Methods inherited from class org.alfresco.repo.content.transform.TikaPoweredContentTransformer
getContentHandler, isTransformable, transformInternal
 
Methods inherited from class org.alfresco.repo.content.transform.AbstractContentTransformer2
checkTransformable, getTransformationTime, recordTime, register, setRegistry, toString, transform, transform, transform
 
Methods inherited from class org.alfresco.repo.content.transform.ContentTransformerHelper
getMimetype, getMimetypeService, isExplicitTransformation, setExplicitTransformations, setMimetypeService
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.alfresco.repo.content.transform.ContentTransformer
isExplicitTransformation
 

Field Detail

SUPPORTED_MIMETYPES

public static java.util.ArrayList SUPPORTED_MIMETYPES
We support all the archive mimetypes that the Tika package parser can handle

Constructor Detail

ArchiveContentTransformer

public ArchiveContentTransformer()
Method Detail

setTikaConfig

public void setTikaConfig(org.apache.tika.config.TikaConfig tikaConfig)
Injects the TikaConfig to use

Parameters:
tikaConfig - The Tika Config to use

setIncludeContents

public void setIncludeContents(java.lang.String includeContents)

getParser

protected org.apache.tika.parser.Parser getParser()
Description copied from class: TikaPoweredContentTransformer
Returns the correct Tika Parser to process the document. If you don't know which you want, use TikaAutoContentTransformer which makes use of the Tika auto-detection.

Specified by:
getParser in class TikaPoweredContentTransformer

buildParseContext

protected org.apache.tika.parser.ParseContext buildParseContext(org.apache.tika.metadata.Metadata metadata,
                                                                java.lang.String targetMimeType,
                                                                TransformationOptions options)
Description copied from class: TikaPoweredContentTransformer
By default returns a ParseContext that does not recurse

Overrides:
buildParseContext in class TikaPoweredContentTransformer


Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.