org.alfresco.repo.content.metadata
Class OfficeMetadataExtracter

java.lang.Object
  extended by org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
      extended by org.alfresco.repo.content.metadata.OfficeMetadataExtracter
All Implemented Interfaces:
ContentWorker, MetadataExtracter

public class OfficeMetadataExtracter
extends AbstractMappingMetadataExtracter

Metadata extracter for Office file formats. It uses the POI library to extract the following raw properties; keys with no target on the right are extracted but have no mapping listed here:

   author:             --      cm:author
   title:              --      cm:title
   subject:            --      cm:description
   createDateTime:     --      cm:created
   lastSaveDateTime:   --      cm:modified
   comments:
   editTime:
   format:
   keywords:
   lastAuthor:
   lastPrinted:
   osVersion:
   thumbnail:
   pageCount:
   wordCount:
 
TIKA Note - everything we currently have should be present in the metadata.
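
The mapping table above can be read as a lookup from raw key to target property. The following is a minimal, self-contained sketch of that idea and not the class's actual implementation: the class name OfficeMappingSketch and the method applyMapping are hypothetical, and the raw key strings are assumed to match the names listed in the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Alfresco API.
class OfficeMappingSketch {
    // Raw key -> target property, copied from the table in the class description.
    // The key strings are assumed to equal the names shown in that table.
    static final Map<String, String> DEFAULT_MAPPING = new LinkedHashMap<>();
    static {
        DEFAULT_MAPPING.put("author", "cm:author");
        DEFAULT_MAPPING.put("title", "cm:title");
        DEFAULT_MAPPING.put("subject", "cm:description");
        DEFAULT_MAPPING.put("createDateTime", "cm:created");
        DEFAULT_MAPPING.put("lastSaveDateTime", "cm:modified");
    }

    // Keeps only the raw values whose key has a target property in the mapping.
    static Map<String, Object> applyMapping(Map<String, Object> raw) {
        Map<String, Object> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            String target = DEFAULT_MAPPING.get(entry.getKey());
            if (target != null) {
                mapped.put(target, entry.getValue());
            }
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("author", "Jane Doe");
        raw.put("subject", "Quarterly report");
        raw.put("pageCount", 12); // no target listed in the table, so it is not mapped
        System.out.println(applyMapping(raw));
    }
}
```

In this sketch, unmapped keys such as pageCount stay available in the raw map, so a differently configured mapping could still pick them up.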


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.alfresco.repo.content.metadata.MetadataExtracter
MetadataExtracter.OverwritePolicy
 
Field Summary
static java.lang.String KEY_AUTHOR
           
static java.lang.String KEY_COMMENTS
           
static java.lang.String KEY_CREATE_DATETIME
           
static java.lang.String KEY_EDIT_TIME
           
static java.lang.String KEY_FORMAT
           
static java.lang.String KEY_KEYWORDS
           
static java.lang.String KEY_LAST_AUTHOR
           
static java.lang.String KEY_LAST_PRINTED
           
static java.lang.String KEY_LAST_SAVE_DATETIME
           
static java.lang.String KEY_OS_VERSION
           
static java.lang.String KEY_PAGE_COUNT
           
static java.lang.String KEY_SUBJECT
           
static java.lang.String KEY_THUMBNAIL
           
static java.lang.String KEY_TITLE
           
static java.lang.String KEY_WORD_COUNT
           
static java.lang.String[] SUPPORTED_MIMETYPES
           
 
Fields inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
logger, NAMESPACE_PROPERTY_PREFIX
 
Constructor Summary
OfficeMetadataExtracter()
           
 
Method Summary
protected  java.util.Map extractRaw(ContentReader reader)
          Override to provide the raw extracted metadata values.
 
Methods inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
checkIsSupported, extract, extract, extract, getDefaultMapping, getExtractionTime, getMapping, getMimetypeService, getReliability, init, isSupported, newRawMap, putRawValue, readMappingProperties, readMappingProperties, register, setDictionaryService, setFailOnTypeConversion, setInheritDefaultMapping, setMapping, setMappingProperties, setMimetypeService, setOverwritePolicy, setOverwritePolicy, setRegistry, setSupportedDateFormats, setSupportedMimetypes
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

KEY_AUTHOR

public static final java.lang.String KEY_AUTHOR
See Also:
Constant Field Values

KEY_TITLE

public static final java.lang.String KEY_TITLE
See Also:
Constant Field Values

KEY_SUBJECT

public static final java.lang.String KEY_SUBJECT
See Also:
Constant Field Values

KEY_CREATE_DATETIME

public static final java.lang.String KEY_CREATE_DATETIME
See Also:
Constant Field Values

KEY_LAST_SAVE_DATETIME

public static final java.lang.String KEY_LAST_SAVE_DATETIME
See Also:
Constant Field Values

KEY_COMMENTS

public static final java.lang.String KEY_COMMENTS
See Also:
Constant Field Values

KEY_EDIT_TIME

public static final java.lang.String KEY_EDIT_TIME
See Also:
Constant Field Values

KEY_FORMAT

public static final java.lang.String KEY_FORMAT
See Also:
Constant Field Values

KEY_KEYWORDS

public static final java.lang.String KEY_KEYWORDS
See Also:
Constant Field Values

KEY_LAST_AUTHOR

public static final java.lang.String KEY_LAST_AUTHOR
See Also:
Constant Field Values

KEY_LAST_PRINTED

public static final java.lang.String KEY_LAST_PRINTED
See Also:
Constant Field Values

KEY_OS_VERSION

public static final java.lang.String KEY_OS_VERSION
See Also:
Constant Field Values

KEY_THUMBNAIL

public static final java.lang.String KEY_THUMBNAIL
See Also:
Constant Field Values

KEY_PAGE_COUNT

public static final java.lang.String KEY_PAGE_COUNT
See Also:
Constant Field Values

KEY_WORD_COUNT

public static final java.lang.String KEY_WORD_COUNT
See Also:
Constant Field Values

SUPPORTED_MIMETYPES

public static java.lang.String[] SUPPORTED_MIMETYPES

Constructor Detail

OfficeMetadataExtracter

public OfficeMetadataExtracter()

Method Detail

extractRaw

protected java.util.Map extractRaw(ContentReader reader)
                            throws java.lang.Throwable
Description copied from class: AbstractMappingMetadataExtracter
Override to provide the raw extracted metadata values. An extracter should extract as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, each instance of the extracter can be configured differently, so more or fewer of the properties may be used in different installations.

Raw values must not be trimmed or removed for any reason. Null values and empty strings are

Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:

 editor: - the document editor        -->  cm:author
 title:  - the document title         -->  cm:title
 user1:  - the document summary
 user2:  - the document description   -->  cm:description
 user3:  -
 user4:  -
 

Specified by:
extractRaw in class AbstractMappingMetadataExtracter
Parameters:
reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.
Returns:
Returns a map of document property values keyed by property name.
Throws:
java.lang.Throwable
See Also:
AbstractMappingMetadataExtracter.getDefaultMapping()
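
The contract described above (return raw values untouched, and close any stream obtained from the reader) can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: StreamSource is a hypothetical stand-in for ContentReader, the Map<String, Serializable> value type is an assumption (the signature above shows a raw java.util.Map), and a trivial "key=value" line format stands in for the POI parsing the real class performs.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Alfresco API.
class ExtractRawSketch {
    // Stand-in for ContentReader; only stream access is modelled here.
    interface StreamSource {
        InputStream getContentInputStream();
    }

    // Reads "key=value" lines and returns the values exactly as found.
    static Map<String, Serializable> extractRaw(StreamSource reader) {
        Map<String, Serializable> raw = new HashMap<>();
        // try-with-resources closes the stream that was accessed directly.
        try (InputStream in = reader.getContentInputStream();
             BufferedReader lines = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = lines.readLine()) != null) {
                int eq = line.indexOf('=');
                if (eq < 0) {
                    continue; // not a key=value line in this toy format
                }
                // No trimming and no dropping of empty strings: values stay raw.
                raw.put(line.substring(0, eq), line.substring(eq + 1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw;
    }

    public static void main(String[] args) {
        byte[] doc = "author= Jane Doe \ntitle=\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractRaw(() -> new ByteArrayInputStream(doc)));
    }
}
```

The sketch leaves the padded author value and the empty title in the map as found, following the rule that raw values are not trimmed or removed by the extracter.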


Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.