org.alfresco.repo.content.metadata
Class OfficeMetadataExtracter
java.lang.Object
org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
org.alfresco.repo.content.metadata.OfficeMetadataExtracter
- All Implemented Interfaces:
- ContentWorker, MetadataExtracter
public class OfficeMetadataExtracter
- extends AbstractMappingMetadataExtracter
Metadata extracter for Office file formats. This extracter uses the POI library to
extract the following properties:
author: -- cm:author
title: -- cm:title
subject: -- cm:description
createDateTime: -- cm:created
lastSaveDateTime: -- cm:modified
comments:
editTime:
format:
keywords:
lastAuthor:
lastPrinted:
osVersion:
thumbnail:
pageCount:
wordCount:
TIKA note: every property currently extracted (listed above) should also be present
in the Tika metadata.
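The mapping listed above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name OfficeMappingSketch and the method applyMapping are hypothetical, property names are plain strings rather than Alfresco QName objects, and the real extracter obtains its mapping through getDefaultMapping() rather than a hard-coded table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of the Alfresco API.
public class OfficeMappingSketch {

    // Raw key -> content model property, copied from the class description.
    // Keys without a listed target (comments, editTime, ...) are left unmapped.
    static final Map<String, String> DEFAULT_MAPPING = new LinkedHashMap<>();
    static {
        DEFAULT_MAPPING.put("author", "cm:author");
        DEFAULT_MAPPING.put("title", "cm:title");
        DEFAULT_MAPPING.put("subject", "cm:description");
        DEFAULT_MAPPING.put("createDateTime", "cm:created");
        DEFAULT_MAPPING.put("lastSaveDateTime", "cm:modified");
    }

    // Keeps only raw values whose key has a mapping and re-keys them
    // by the target property name.
    static Map<String, Object> applyMapping(Map<String, Object> raw) {
        Map<String, Object> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            String target = DEFAULT_MAPPING.get(entry.getKey());
            if (target != null) {
                mapped.put(target, entry.getValue());
            }
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("author", "A. Writer");
        raw.put("subject", "Quarterly report");
        raw.put("wordCount", 1200); // extracted, but has no default mapping
        System.out.println(applyMapping(raw));
    }
}
```

In this sketch the unmapped wordCount value is simply dropped at mapping time. It would still be part of the raw map, so an installation with a different mapping could use it.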
Method Summary
protected java.util.Map
extractRaw(ContentReader reader)
Override to provide the raw extracted metadata values.
Methods inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
checkIsSupported, extract, extract, extract, getDefaultMapping, getExtractionTime, getMapping, getMimetypeService, getReliability, init, isSupported, newRawMap, putRawValue, readMappingProperties, readMappingProperties, register, setDictionaryService, setFailOnTypeConversion, setInheritDefaultMapping, setMapping, setMappingProperties, setMimetypeService, setOverwritePolicy, setOverwritePolicy, setRegistry, setSupportedDateFormats, setSupportedMimetypes
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
KEY_AUTHOR
public static final java.lang.String KEY_AUTHOR
- See Also:
- Constant Field Values
KEY_TITLE
public static final java.lang.String KEY_TITLE
- See Also:
- Constant Field Values
KEY_SUBJECT
public static final java.lang.String KEY_SUBJECT
- See Also:
- Constant Field Values
KEY_CREATE_DATETIME
public static final java.lang.String KEY_CREATE_DATETIME
- See Also:
- Constant Field Values
KEY_LAST_SAVE_DATETIME
public static final java.lang.String KEY_LAST_SAVE_DATETIME
- See Also:
- Constant Field Values
KEY_COMMENTS
public static final java.lang.String KEY_COMMENTS
- See Also:
- Constant Field Values
KEY_EDIT_TIME
public static final java.lang.String KEY_EDIT_TIME
- See Also:
- Constant Field Values
KEY_FORMAT
public static final java.lang.String KEY_FORMAT
- See Also:
- Constant Field Values
KEY_KEYWORDS
public static final java.lang.String KEY_KEYWORDS
- See Also:
- Constant Field Values
KEY_LAST_AUTHOR
public static final java.lang.String KEY_LAST_AUTHOR
- See Also:
- Constant Field Values
KEY_LAST_PRINTED
public static final java.lang.String KEY_LAST_PRINTED
- See Also:
- Constant Field Values
KEY_OS_VERSION
public static final java.lang.String KEY_OS_VERSION
- See Also:
- Constant Field Values
KEY_THUMBNAIL
public static final java.lang.String KEY_THUMBNAIL
- See Also:
- Constant Field Values
KEY_PAGE_COUNT
public static final java.lang.String KEY_PAGE_COUNT
- See Also:
- Constant Field Values
KEY_WORD_COUNT
public static final java.lang.String KEY_WORD_COUNT
- See Also:
- Constant Field Values
SUPPORTED_MIMETYPES
public static java.lang.String[] SUPPORTED_MIMETYPES
OfficeMetadataExtracter
public OfficeMetadataExtracter()
extractRaw
protected java.util.Map extractRaw(ContentReader reader)
throws java.lang.Throwable
- Description copied from class:
AbstractMappingMetadataExtracter
- Override to provide the raw extracted metadata values. An extracter should extract
as many of the available properties as is realistically possible. Even if the
default mapping doesn't handle all properties, each instance of the extracter can be
configured differently, so more or fewer of the properties may be used in different
installations.
Raw values must not be trimmed or removed for any reason. Null values, empty strings
and non-serializable values are handled as follows:
- Null: Removed
- Empty String: Passed to the OverwritePolicy
- Non Serializable: Converted to String or fails if that is not possible
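The three handling rules above can be sketched as follows. This is a minimal illustration, not the framework's actual code: the class RawValueRulesSketch and the method normalise are hypothetical names, and the OverwritePolicy step is not modelled, so empty strings are simply passed through.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of the Alfresco API.
public class RawValueRulesSketch {

    static Map<String, Serializable> normalise(Map<String, Object> raw) {
        Map<String, Serializable> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                // Null: removed
                continue;
            }
            if (value instanceof Serializable) {
                // Serializable values, including empty strings, are passed on
                // unchanged (in the real framework, to the OverwritePolicy).
                result.put(entry.getKey(), (Serializable) value);
            } else {
                // Non-serializable: converted to String. If toString() throws,
                // the exception propagates and the conversion fails.
                result.put(entry.getKey(), String.valueOf(value));
            }
        }
        return result;
    }
}
```

Note that the sketch does not trim or drop empty strings, in line with the requirement that raw values are left untouched by the extracter itself.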
Properties extracted and their meanings and types should be thoroughly described in
the class-level javadocs of the extracter implementation, for example:
editor: - the document editor --> cm:author
title: - the document title --> cm:title
user1: - the document summary
user2: - the document description --> cm:description
user3: -
user4: -
- Specified by:
extractRaw
in class AbstractMappingMetadataExtracter
- Parameters:
reader
- the document to extract the values from. The stream provided by
the reader must be closed if accessed directly.
- Returns:
- Returns a map of document property values keyed by property name.
- Throws:
java.lang.Throwable
- See Also:
AbstractMappingMetadataExtracter.getDefaultMapping()
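The requirement that a directly accessed stream be closed can be illustrated with try-with-resources. This is a sketch under assumptions: StreamSource is a hypothetical stand-in for ContentReader that models only a getContentInputStream() accessor, TrackingStream exists purely so the close can be observed, and the byteCount key is invented for the example. A real override would pass the stream to a parser, could declare throws Throwable, and would return the Office properties listed in the class description.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of the Alfresco API.
public class ExtractRawSketch {

    // Stand-in for ContentReader: only the stream accessor is modelled.
    interface StreamSource {
        InputStream getContentInputStream() throws IOException;
    }

    // In-memory stream that records whether close() was called.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    static Map<String, Object> extractRaw(StreamSource reader) {
        Map<String, Object> rawProperties = new HashMap<>();
        // try-with-resources closes the stream even if reading fails.
        try (InputStream is = reader.getContentInputStream()) {
            int count = 0;
            while (is.read() != -1) {
                count++;
            }
            // Invented key, used only to show a value being recorded.
            rawProperties.put("byteCount", count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rawProperties;
    }
}
```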
Copyright © 2005 - 2010 Alfresco Software, Inc. All Rights Reserved.