public class TikaOfficeDetectParser
extends java.lang.Object
implements org.apache.tika.parser.Parser
Apache Tika assumes that
you either know exactly what your content is, or that
you'll leave it to auto-detection.
Within Alfresco, we usually do know. However, from time
to time, we don't know if we have one of the old or one
of the new office files (e.g. .xls and .xlsx).
This class automatically selects the appropriate
old (OLE2) or new (OOXML) Tika parser as required.

| Constructor and Description |
|---|
| TikaOfficeDetectParser() |
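
As a rough illustration of the choice described above, the sketch below tells the two container formats apart by their leading bytes. It is not Alfresco's implementation and calls no Tika API: `OfficeFormatSniffer`, `Format`, and `detect` are hypothetical names introduced here, and treating every ZIP archive as OOXML is a simplifying assumption.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of Alfresco or Apache Tika.
final class OfficeFormatSniffer {

    enum Format { OLE2, OOXML, UNKNOWN }

    // First 8 bytes of an OLE2 compound document (.doc, .xls, .ppt).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    // ZIP local file header signature "PK\3\4". OOXML files (.docx, .xlsx, .pptx)
    // are ZIP packages; assumption: any ZIP seen here is an OOXML file.
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Peeks at the header, then rewinds so a parser can read the stream from the start.
    static Format detect(InputStream stream) throws IOException {
        if (!stream.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] header = new byte[OLE2_MAGIC.length];
        int read = 0;
        stream.mark(header.length);
        try {
            while (read < header.length) {
                int n = stream.read(header, read, header.length - read);
                if (n < 0) {
                    break; // stream is shorter than the longest signature
                }
                read += n;
            }
        } finally {
            stream.reset();
        }
        if (startsWith(header, read, OLE2_MAGIC)) {
            return Format.OLE2;
        }
        if (startsWith(header, read, ZIP_MAGIC)) {
            return Format.OOXML;
        }
        return Format.UNKNOWN;
    }

    // Compares the first magic.length bytes of data, treating bytes as unsigned.
    private static boolean startsWith(byte[] data, int length, int[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would typically wrap the raw stream in a `java.io.BufferedInputStream` so that mark/reset is available, then pass the rewound stream to whichever parser matches the detected format.
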
| Modifier and Type | Method and Description |
|---|---|
| java.util.Set | getSupportedTypes(org.apache.tika.parser.ParseContext parseContext) |
| void | parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, org.apache.tika.metadata.Metadata metadata) Deprecated. This method will be removed in Apache Tika 1.0. |
| void | parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext parseContext) |

public java.util.Set getSupportedTypes(org.apache.tika.parser.ParseContext parseContext)
Specified by: getSupportedTypes in interface org.apache.tika.parser.Parser

public void parse(java.io.InputStream stream,
org.xml.sax.ContentHandler handler,
org.apache.tika.metadata.Metadata metadata,
org.apache.tika.parser.ParseContext parseContext)
throws java.io.IOException,
org.xml.sax.SAXException,
org.apache.tika.exception.TikaException
Specified by: parse in interface org.apache.tika.parser.Parser
Throws: java.io.IOException, org.xml.sax.SAXException, org.apache.tika.exception.TikaException

public void parse(java.io.InputStream stream,
org.xml.sax.ContentHandler handler,
org.apache.tika.metadata.Metadata metadata)
throws java.io.IOException,
org.xml.sax.SAXException,
org.apache.tika.exception.TikaException
Throws: java.io.IOException, org.xml.sax.SAXException, org.apache.tika.exception.TikaException

Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.