org.alfresco.repo.content.transform
Class TextMiningContentTransformer
java.lang.Object
org.alfresco.repo.content.transform.ContentTransformerHelper
org.alfresco.repo.content.transform.AbstractContentTransformer2
org.alfresco.repo.content.transform.TextMiningContentTransformer
- All Implemented Interfaces:
- ContentWorker, ContentTransformer
public class TextMiningContentTransformer
- extends AbstractContentTransformer2
This badly named transformer turns Microsoft Word documents
(Word 6, 95, 97, 2000, 2003) into plain text.
It does not currently use Apache Tika to
do this, pending TIKA-408. Once Apache POI 3.7 beta 2 has been
released, the transformer can switch to Tika and then handle the Word 6,
Word 95, Word 97, 2000, 2003, 2007 and 2010 formats.
TODO: Switch to Tika in August 2010.
Method Summary
boolean
isTransformable(java.lang.String sourceMimetype,
java.lang.String targetMimetype,
TransformationOptions options)
Currently the only transformation performed is that of text extraction from Word documents.
void
transformInternal(org.alfresco.service.cmr.repository.ContentReader reader,
org.alfresco.service.cmr.repository.ContentWriter writer,
TransformationOptions options)
Method to be implemented by subclasses wishing to make use of the common infrastructural code
provided by this class.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
TextMiningContentTransformer
public TextMiningContentTransformer()
isTransformable
public boolean isTransformable(java.lang.String sourceMimetype,
java.lang.String targetMimetype,
TransformationOptions options)
- Currently the only transformation performed is that of text extraction from Word documents.
- Parameters:
sourceMimetype - the source mimetype
targetMimetype - the target mimetype
options - the transformation options
- Returns:
- boolean true if this content transformer can satisfy the mimetypes and options specified, false otherwise
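The check described above can be illustrated with a minimal, self-contained sketch. It is not the Alfresco implementation: the class name WordToTextCheckSketch is hypothetical, the mimetype strings "application/msword" and "text/plain" are assumed values for Word and plain text, and a plain Map stands in for TransformationOptions.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in, not the Alfresco class: models the mimetype check only.
class WordToTextCheckSketch {
    // Assumed mimetype strings for Word binary documents and plain text.
    static final String MIMETYPE_WORD = "application/msword";
    static final String MIMETYPE_TEXT_PLAIN = "text/plain";

    // Returns true only for the single supported pairing: Word in, plain text out.
    // The options map mirrors the documented signature but is not consulted here.
    static boolean isTransformable(String sourceMimetype,
                                   String targetMimetype,
                                   Map<String, Object> options) {
        return MIMETYPE_WORD.equals(sourceMimetype)
                && MIMETYPE_TEXT_PLAIN.equals(targetMimetype);
    }

    public static void main(String[] args) {
        Map<String, Object> noOptions = Collections.<String, Object>emptyMap();
        System.out.println(isTransformable("application/msword", "text/plain", noOptions));
        System.out.println(isTransformable("application/pdf", "text/plain", noOptions));
    }
}
```

Comparing with the constant on the left-hand side of equals means a null mimetype yields false rather than a NullPointerException.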
transformInternal
public void transformInternal(org.alfresco.service.cmr.repository.ContentReader reader,
org.alfresco.service.cmr.repository.ContentWriter writer,
TransformationOptions options)
throws java.lang.Exception
- Description copied from class:
AbstractContentTransformer2
- Method to be implemented by subclasses wishing to make use of the common infrastructural code
provided by this class.
- Specified by:
transformInternal in class AbstractContentTransformer2
- Parameters:
reader - the source of the content to transform
writer - the target to which to write the transformed content
options - a map of options to use when performing the transformation. The map will never be null.
- Throws:
java.lang.Exception - exceptions will be handled by this class - subclasses can throw anything
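The contract described for transformInternal (the base class supplies a non-null options map and handles whatever the subclass throws) resembles a template method. The sketch below shows that shape under stated assumptions: TransformerBaseSketch and CopyTransformerSketch are hypothetical names, java.io.Reader and java.io.Writer stand in for ContentReader and ContentWriter, and wrapping failures in IllegalStateException is an illustrative choice, not necessarily what AbstractContentTransformer2 does.

```java
import java.io.Reader;
import java.io.Writer;
import java.util.Collections;
import java.util.Map;

// Hypothetical base class standing in for the role AbstractContentTransformer2 plays.
abstract class TransformerBaseSketch {
    // Public entry point: guarantees a non-null options map and wraps any failure.
    final void transform(Reader reader, Writer writer, Map<String, Object> options) {
        Map<String, Object> safeOptions =
                (options == null) ? Collections.<String, Object>emptyMap() : options;
        try {
            transformInternal(reader, writer, safeOptions);
        } catch (Exception e) {
            // The subclass may throw anything; the base class converts it to one unchecked type.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    // Subclasses implement only the conversion itself.
    protected abstract void transformInternal(Reader reader,
                                              Writer writer,
                                              Map<String, Object> options) throws Exception;
}

// Hypothetical subclass: copies characters through unchanged instead of parsing Word.
class CopyTransformerSketch extends TransformerBaseSketch {
    @Override
    protected void transformInternal(Reader reader,
                                     Writer writer,
                                     Map<String, Object> options) throws Exception {
        char[] buffer = new char[1024];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, read);
        }
    }
}
```

Keeping the error handling in the base class lets each subclass declare throws java.lang.Exception and concentrate on the conversion logic.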
Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.