org.alfresco.repo.content.metadata
Interface MetadataExtracter

All Superinterfaces:
ContentWorker
All Known Implementing Classes:
AbstractMappingMetadataExtracter, AbstractMetadataExtracter, DWGMetadataExtracter, HtmlMetadataExtracter, MailMetadataExtracter, MappingMetadataExtracterTest.DummyMappingMetadataExtracter, MP3MetadataExtracter, OfficeMetadataExtracter, OpenDocumentMetadataExtracter, OpenOfficeMetadataExtracter, PdfBoxMetadataExtracter, PoiMetadataExtracter, RFC822MetadataExtracter, TikaAutoMetadataExtracter, TikaPoweredMetadataExtracter, TikaSpringConfiguredMetadataExtracter, XmlMetadataExtracter, XPathMetadataExtracter

public interface MetadataExtracter
extends ContentWorker

Interface for document property extracters.

Please pardon the incorrect spelling of extractor.


Nested Class Summary
static class MetadataExtracter.OverwritePolicy
          An enumeration of functional property overwrite policies.
 
Method Summary
 java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader, java.util.Map destination)
          Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.
 java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, java.util.Map destination)
          Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.
 java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, java.util.Map destination, java.util.Map mapping)
          Extracts the metadata from the content provided by the reader and source mimetype to the supplied map.
 long getExtractionTime()
          Deprecated. Generally not useful or used. Extraction is normally specifically configured.
 double getReliability(java.lang.String mimetype)
          Deprecated. This method is replaced by MetadataExtracter.isSupported(String)
 boolean isSupported(java.lang.String mimetype)
          Determines if the extracter works against the given mimetype.
 

Method Detail

getReliability

double getReliability(java.lang.String mimetype)
Deprecated. This method is replaced by MetadataExtracter.isSupported(String)

Get an estimate of the extracter's reliability on a scale from 0.0 to 1.0.

Parameters:
mimetype - the mimetype to check
Returns:
Returns a reliability indicator from 0.0 to 1.0

isSupported

boolean isSupported(java.lang.String mimetype)
Determines if the extracter works against the given mimetype.

Parameters:
mimetype - the document mimetype
Returns:
Returns true if the mimetype is supported, otherwise false.
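
The up-front support check can be sketched as follows. This is a minimal illustration, not the Alfresco implementation: SimpleReader and SimpleExtracter are hypothetical stand-ins for ContentReader and MetadataExtracter, reduced to the two calls involved, and the helper name extractIfSupported is likewise invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ContentReader; only the mimetype accessor is modelled.
interface SimpleReader {
    String getMimetype();
}

// Hypothetical, cut-down version of the extracter contract described above.
interface SimpleExtracter {
    boolean isSupported(String mimetype);
    Map<String, Object> extract(SimpleReader reader, Map<String, Object> destination);
}

class SupportCheckSketch {
    // Runs the extracter only when it declares support for the reader's mimetype.
    static Map<String, Object> extractIfSupported(SimpleExtracter extracter,
                                                  SimpleReader reader,
                                                  Map<String, Object> destination) {
        String mimetype = reader.getMimetype();
        if (mimetype == null || !extracter.isSupported(mimetype)) {
            // An empty map signals that nothing was modified.
            return new HashMap<>();
        }
        return extracter.extract(reader, destination);
    }

    // A toy extracter that only claims PDF and always reports a fixed title.
    static SimpleExtracter pdfOnly() {
        return new SimpleExtracter() {
            @Override
            public boolean isSupported(String mimetype) {
                return "application/pdf".equals(mimetype);
            }

            @Override
            public Map<String, Object> extract(SimpleReader reader, Map<String, Object> destination) {
                Map<String, Object> changed = new HashMap<>();
                changed.put("title", "Example");
                destination.putAll(changed);
                return changed;
            }
        };
    }
}
```

Checking first avoids handing content to an extracter that cannot read it; callers that skip the check should be prepared for the extraction to fail or to return nothing.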

getExtractionTime

long getExtractionTime()
Deprecated. Generally not useful or used. Extraction is normally specifically configured.

Provides an estimate, usually a worst case guess, of how long an extraction will take.

This method is used to determine, up front, which of a set of equally reliable extracters will be used for a specific extraction.

Returns:
Returns the approximate number of milliseconds per extraction

extract

java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader,
                      java.util.Map destination)
Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map. The internal mapping and overwrite policy between document metadata and system metadata will be used.

The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

Parameters:
reader - the source of the content
destination - the map of properties to populate (essentially a return value)
Returns:
Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
Throws:
org.alfresco.service.cmr.repository.ContentIOException - if a detectable error occurs
See Also:
MetadataExtracter.extract(ContentReader, OverwritePolicy, Map, Map)
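
The relationship between the destination map and the returned map can be illustrated with a small sketch. It assumes plain String keys in place of the repository's property names, and the class ChangedPropertiesSketch and its apply method are hypothetical; the real extracters also consult their mapping and overwrite policy, which is left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ChangedPropertiesSketch {
    // Copies extracted values into destination and returns only the entries
    // that were added or whose value actually changed.
    static Map<String, Object> apply(Map<String, Object> extracted,
                                     Map<String, Object> destination) {
        Map<String, Object> changed = new HashMap<>();
        for (Map.Entry<String, Object> entry : extracted.entrySet()) {
            String key = entry.getKey();
            Object newValue = entry.getValue();
            boolean present = destination.containsKey(key);
            if (!present || !Objects.equals(destination.get(key), newValue)) {
                destination.put(key, newValue);
                changed.put(key, newValue);
            }
        }
        return changed;
    }
}
```

In this sketch the destination map carries the full, updated property set, while the returned map is the delta, so an empty result means the destination was left exactly as supplied.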

extract

java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader,
                      MetadataExtracter.OverwritePolicy overwritePolicy,
                      java.util.Map destination)
Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.

The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

Parameters:
reader - the source of the content
overwritePolicy - the policy stipulating how the system properties must be overwritten if present
destination - the map of properties to populate (essentially a return value)
Returns:
Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
Throws:
org.alfresco.service.cmr.repository.ContentIOException - if a detectable error occurs
See Also:
MetadataExtracter.extract(ContentReader, OverwritePolicy, Map, Map)
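
How an overwrite policy might govern the update of existing properties is sketched below. The enum SketchPolicy and its two values are hypothetical and are not the constants of MetadataExtracter.OverwritePolicy; they only illustrate the idea that the policy decides whether an extracted value may replace one already present.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical policies for illustration only; these are NOT the constants
// defined by MetadataExtracter.OverwritePolicy.
enum SketchPolicy {
    ALWAYS_OVERWRITE,
    KEEP_EXISTING
}

class OverwritePolicySketch {
    // Applies extracted values to destination under the given policy and
    // returns the properties that were added or modified.
    static Map<String, Object> apply(SketchPolicy policy,
                                     Map<String, Object> extracted,
                                     Map<String, Object> destination) {
        Map<String, Object> changed = new HashMap<>();
        for (Map.Entry<String, Object> entry : extracted.entrySet()) {
            String key = entry.getKey();
            Object newValue = entry.getValue();
            Object oldValue = destination.get(key);
            if (policy == SketchPolicy.KEEP_EXISTING && oldValue != null) {
                continue; // a value is already present, so leave it alone
            }
            if (Objects.equals(oldValue, newValue)) {
                continue; // nothing would change
            }
            destination.put(key, newValue);
            changed.put(key, newValue);
        }
        return changed;
    }
}
```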

extract

java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader,
                      MetadataExtracter.OverwritePolicy overwritePolicy,
                      java.util.Map destination,
                      java.util.Map mapping)
Extracts the metadata from the content provided by the reader and source mimetype to the supplied map. The mapping from document metadata to system metadata is explicitly provided. The overwrite policy is also explicitly set.

The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).

The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

Parameters:
reader - the source of the content
overwritePolicy - the policy stipulating how the system properties must be overwritten if present
destination - the map of properties to populate (essentially a return value)
mapping - a mapping of document-specific properties to system properties.
Returns:
Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
Throws:
org.alfresco.service.cmr.repository.ContentIOException - if a detectable error occurs
See Also:
MetadataExtracter.extract(ContentReader, Map)
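
The role of an explicit mapping can be sketched as a translation step from document-specific property names to system property names. The sketch assumes that one document property may feed several system properties and uses plain strings in place of qualified names; the class MappingSketch, the method mapToSystem and the property names in the test data are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class MappingSketch {
    // Translates raw document properties into system properties.
    // Document properties without a mapping entry are dropped.
    static Map<String, Object> mapToSystem(Map<String, Object> raw,
                                           Map<String, Set<String>> mapping) {
        Map<String, Object> system = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : mapping.entrySet()) {
            Object value = raw.get(entry.getKey());
            if (value == null) {
                continue; // the document did not carry this property
            }
            for (String systemName : entry.getValue()) {
                system.put(systemName, value);
            }
        }
        return system;
    }
}
```

The translated map would then be merged into the destination map according to the overwrite policy supplied with the call.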


Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.