java.lang.Object
  org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
public abstract class AbstractMappingMetadataExtracter
Support class for metadata extracters that support dynamic and config-driven mapping between extracted values and model properties. Extraction is broken up into two phases:
Migrating an existing extracter to use this class is straightforward:
- The extractInternal method: this now returns a raw map of extracted values keyed by document-specific property names. The trimPut method has been replaced with the equivalent AbstractMappingMetadataExtracter.putRawValue(String, Serializable, Map).
- The AbstractMappingMetadataExtracter.getDefaultMapping() method: the simplest approach is to provide the default mapping in a correlated .properties file.
See Also: AbstractMappingMetadataExtracter.getDefaultMapping(), AbstractMappingMetadataExtracter.extractRaw(ContentReader), AbstractMappingMetadataExtracter.setMapping(Map)
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface org.alfresco.repo.content.metadata.MetadataExtracter |
---|
MetadataExtracter.OverwritePolicy |
Field Summary | |
---|---|
protected static org.apache.commons.logging.Log | logger |
static java.lang.String | NAMESPACE_PROPERTY_PREFIX |
protected java.util.Set | supportedDateFormats |
Constructor Summary | |
---|---|
protected | AbstractMappingMetadataExtracter() - Default constructor. |
protected | AbstractMappingMetadataExtracter(java.util.Set supportedMimetypes) - Constructor that can be used when the list of supported mimetypes is known up front. |
Method Summary | |
---|---|
protected void | checkIsSupported(org.alfresco.service.cmr.repository.ContentReader reader) - Checks if the mimetype is supported. |
java.util.Map | extract(org.alfresco.service.cmr.repository.ContentReader reader, java.util.Map destination) - Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map. |
java.util.Map | extract(org.alfresco.service.cmr.repository.ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, java.util.Map destination) - Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map. |
java.util.Map | extract(org.alfresco.service.cmr.repository.ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, java.util.Map destination, java.util.Map mapping) - Extracts the metadata from the content provided by the reader and source mimetype to the supplied map. |
protected abstract java.util.Map | extractRaw(org.alfresco.service.cmr.repository.ContentReader reader) - Override to provide the raw extracted metadata values. |
protected void | filterSystemProperties(java.util.Map systemProperties, java.util.Map targetProperties) - Filters the system properties that are going to be applied. |
protected java.util.Map | getDefaultMapping() - This method provides a best guess of where to store the values extracted from the documents. |
long | getExtractionTime() - Provides an estimate, usually a worst case guess, of how long an extraction will take. |
protected java.util.Map | getMapping() - Helper method for derived classes to obtain the mappings that will be applied to raw values. |
protected org.alfresco.service.cmr.repository.MimetypeService | getMimetypeService() |
double | getReliability(java.lang.String mimetype) - TODO - This doesn't appear to be used, so should be removed / deprecated / replaced |
protected void | init() - Provides a hook point for implementations to perform initialization. |
boolean | isSupported(java.lang.String sourceMimetype) - Determines if the extracter works against the given mimetype. |
protected java.util.Date | makeDate(java.lang.String dateStr) - Convert a date String to a Date object |
protected java.util.Map | newRawMap() - Helper method to fetch a clean map into which raw values can be dumped. |
protected boolean | putRawValue(java.lang.String key, java.io.Serializable value, java.util.Map destination) - Adds a value to the map, conserving null values. |
protected java.util.Map | readMappingProperties(java.util.Properties mappingProperties) - A utility method to convert mapping properties to the Map form. |
protected java.util.Map | readMappingProperties(java.lang.String propertiesUrl) - A utility method to read mapping properties from a resource file and convert to the map form. |
void | register() - Registers this instance of the extracter with the registry. |
void | setDictionaryService(org.alfresco.service.cmr.dictionary.DictionaryService dictionaryService) |
void | setFailOnTypeConversion(boolean failOnTypeConversion) - Set whether the extractor should discard metadata that fails to convert to the target type defined in the data dictionary model. |
void | setInheritDefaultMapping(boolean inheritDefaultMapping) - Set if the property mappings augment or override the mapping generically provided by the extracter implementation. |
void | setMapping(java.util.Map mapping) - Set the mapping from document metadata to system metadata. |
void | setMappingProperties(java.util.Properties mappingProperties) - Set the properties that contain the mapping from document metadata to system metadata. |
void | setMimetypeService(org.alfresco.service.cmr.repository.MimetypeService mimetypeService) |
void | setOverwritePolicy(MetadataExtracter.OverwritePolicy overwritePolicy) - Set the policy to use when existing values are encountered. |
void | setOverwritePolicy(java.lang.String overwritePolicyStr) - Set the policy to use when existing values are encountered. |
void | setRegistry(MetadataExtracterRegistry registry) - Set the registry to register with. |
void | setSupportedDateFormats(java.util.List supportedDateFormats) - Set the date formats, over and above the ISO8601 format, that will be supported for string to date conversions. |
void | setSupportedMimetypes(java.util.Collection supportedMimetypes) - Set the mimetypes that are supported by the extracter. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String NAMESPACE_PROPERTY_PREFIX
protected static org.apache.commons.logging.Log logger
protected java.util.Set supportedDateFormats
Constructor Detail |
---|
protected AbstractMappingMetadataExtracter()
Default constructor. When it is used, AbstractMappingMetadataExtracter.isSupported(String) should be implemented. This is useful when the list of supported mimetypes is not known when the instance is constructed. Alternatively, once the set becomes known, call AbstractMappingMetadataExtracter.setSupportedMimetypes(Collection).
See Also: AbstractMappingMetadataExtracter.isSupported(String), AbstractMappingMetadataExtracter.setSupportedMimetypes(Collection)
protected AbstractMappingMetadataExtracter(java.util.Set supportedMimetypes)
Constructor that can be used when the list of supported mimetypes is known up front.
Parameters:
supportedMimetypes - the set of mimetypes supported by default
Method Detail |
---|
public void setRegistry(MetadataExtracterRegistry registry)
Set the registry to register with.
Parameters:
registry - a metadata extracter registry
public void setMimetypeService(org.alfresco.service.cmr.repository.MimetypeService mimetypeService)
Parameters:
mimetypeService - the mimetype service. Set this if required.
protected org.alfresco.service.cmr.repository.MimetypeService getMimetypeService()
public void setDictionaryService(org.alfresco.service.cmr.dictionary.DictionaryService dictionaryService)
Parameters:
dictionaryService - the dictionary service to determine which data conversions are necessary
public void setSupportedMimetypes(java.util.Collection supportedMimetypes)
Set the mimetypes that are supported by the extracter.
Parameters:
supportedMimetypes -
public boolean isSupported(java.lang.String sourceMimetype)
Determines if the extracter works against the given mimetype.
Specified by: isSupported in interface MetadataExtracter
Parameters:
sourceMimetype - the document mimetype
See Also: AbstractMappingMetadataExtracter.setSupportedMimetypes(Collection)
public double getReliability(java.lang.String mimetype)
Specified by: getReliability in interface MetadataExtracter
Parameters:
mimetype - the mimetype to check
Returns: 1.0 if the mimetype is supported, otherwise 0.0
See Also: AbstractMappingMetadataExtracter.isSupported(String)
public void setOverwritePolicy(MetadataExtracter.OverwritePolicy overwritePolicy)
Set the policy to use when existing values are encountered.
Parameters:
overwritePolicy - the policy to apply when there are existing system properties
public void setOverwritePolicy(java.lang.String overwritePolicyStr)
Set the policy to use when existing values are encountered.
Parameters:
overwritePolicyStr - the policy to apply when there are existing system properties
public void setFailOnTypeConversion(boolean failOnTypeConversion)
Set whether the extractor should discard metadata that fails to convert to the target type defined in the data dictionary model.
Parameters:
failOnTypeConversion - false to discard properties that can't get converted to the dictionary-defined type, or true (default) to fail the extraction if the type doesn't convert
public void setSupportedDateFormats(java.util.List supportedDateFormats)
Set the date formats, over and above the ISO8601 format, that will be supported for string to date conversions. The supported syntax is described by the SimpleDateFormat Javadocs.
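The fallback behaviour described for string to date conversion can be sketched as follows. This is an illustration only, not the Alfresco implementation: the class DateFormatSketch and its parse method are hypothetical names, ISO8601 parsing is approximated here with java.time.OffsetDateTime, and the extra patterns are interpreted in UTC purely to keep the example deterministic.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;

// Hypothetical helper, not part of Alfresco
final class DateFormatSketch
{
    /** Returns the parsed date, or null if neither ISO 8601 nor any extra pattern matches. */
    static Date parse(String dateStr, List<String> extraPatterns)
    {
        try
        {
            // First attempt: ISO 8601 with an offset, e.g. 2008-01-15T10:30:00Z
            return Date.from(OffsetDateTime.parse(dateStr).toInstant());
        }
        catch (DateTimeParseException e)
        {
            // Not ISO 8601: fall through to the configured patterns
        }
        for (String pattern : extraPatterns)
        {
            // Patterns use the SimpleDateFormat syntax
            SimpleDateFormat format = new SimpleDateFormat(pattern);
            format.setLenient(false);
            format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption made for this sketch
            try
            {
                return format.parse(dateStr);
            }
            catch (ParseException e)
            {
                // Try the next pattern
            }
        }
        return null;
    }

    public static void main(String[] args)
    {
        List<String> patterns = Arrays.asList("yyyy-MM-dd", "dd/MM/yyyy HH:mm");
        System.out.println(parse("2008-01-15T10:30:00Z", patterns));
        System.out.println(parse("15/01/2008 10:30", patterns));
        System.out.println(parse("not a date", patterns));
    }
}
```

The patterns are tried in list order, so more specific patterns are best listed first.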
Parameters:
supportedDateFormats - a list of supported date formats.
public void setInheritDefaultMapping(boolean inheritDefaultMapping)
Set if the property mappings augment or override the default mappings generically provided by the extracter implementation.
Parameters:
inheritDefaultMapping - true to add the configured mapping to the list of default mappings.
See Also: AbstractMappingMetadataExtracter.getDefaultMapping(), AbstractMappingMetadataExtracter.setMapping(Map), AbstractMappingMetadataExtracter.setMappingProperties(Properties)
public void setMapping(java.util.Map mapping)
Set the mapping from document metadata to system metadata. [...] default converter.
Parameters:
mapping - a mapping from document metadata to system metadata
public void setMappingProperties(java.util.Properties mappingProperties)
Set the properties that contain the mapping from document metadata to system metadata. This is an alternative to the AbstractMappingMetadataExtracter.setMapping(Map) method. Any mappings already present will be cleared out.
The property mapping is of the form:
    # Namespaces prefixes
    namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
    namespace.prefix.my=http://www....com/alfresco/1.0
    # Mapping
    editor=cm:author, my:editor
    title=cm:title
    user1=cm:summary
    user2=cm:description
The mapping can therefore be from a single document property onto several system properties.
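To make the property format concrete, here is a minimal, self-contained sketch of how such a file could be turned into a one-to-many mapping. It is not the Alfresco implementation: MappingPropertiesSketch and toMapping are hypothetical names, targets are represented as plain "{namespaceUri}localName" strings rather than QName objects, the "namespace.prefix." key prefix is inferred from the example above, and the URI used for the "my" prefix is a made-up placeholder.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper, not part of Alfresco
final class MappingPropertiesSketch
{
    // Key prefix inferred from the example format; an assumption of this sketch
    static final String NAMESPACE_PREFIX_KEY = "namespace.prefix.";

    /** Converts mapping properties to: document property -> set of "{namespaceUri}localName" strings. */
    static Map<String, Set<String>> toMapping(Properties props)
    {
        // Pass 1: collect the declared namespace prefixes
        Map<String, String> namespaces = new LinkedHashMap<>();
        for (String key : props.stringPropertyNames())
        {
            if (key.startsWith(NAMESPACE_PREFIX_KEY))
            {
                namespaces.put(key.substring(NAMESPACE_PREFIX_KEY.length()), props.getProperty(key).trim());
            }
        }
        // Pass 2: resolve each comma-separated target against those prefixes
        Map<String, Set<String>> mapping = new LinkedHashMap<>();
        for (String key : props.stringPropertyNames())
        {
            if (key.startsWith(NAMESPACE_PREFIX_KEY))
            {
                continue;
            }
            Set<String> targets = new LinkedHashSet<>();
            for (String target : props.getProperty(key).split(","))
            {
                String[] parts = target.trim().split(":", 2);
                if (parts.length != 2 || !namespaces.containsKey(parts[0]))
                {
                    throw new IllegalArgumentException("Unresolvable target '" + target.trim() + "' for " + key);
                }
                targets.add("{" + namespaces.get(parts[0]) + "}" + parts[1]);
            }
            mapping.put(key, targets);
        }
        return mapping;
    }

    public static void main(String[] args) throws Exception
    {
        Properties props = new Properties();
        props.load(new StringReader(
                "namespace.prefix.cm=http://www.alfresco.org/model/content/1.0\n" +
                "namespace.prefix.my=http://example.com/model/my/1.0\n" + // placeholder URI
                "editor=cm:author, my:editor\n" +
                "title=cm:title\n"));
        System.out.println(toMapping(props));
    }
}
```

In this sketch the editor document property resolves to two targets, which is the one-to-many case the format allows.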
Parameters:
mappingProperties - the properties that map document properties to system properties
protected final java.util.Map getMapping()
Helper method for derived classes to obtain the mappings that will be applied to raw values. Normally, the list of properties that can be extracted from a document is fixed and well-known; in that case, just extract everything. But some implementations may have an extra, indeterminate set of values available for extraction. If the extraction of these runtime parameters is expensive, then the keys provided by the return value can be used to decide which values to extract from the documents. The metadata extraction becomes fully configuration-driven, i.e. declaring further mappings will result in more values being extracted from the documents.
Most extractors will not be using this method. For an example of its use, see the OpenDocument extractor, which uses the mapping to select specific user properties from a document.
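The configuration-driven idea can be sketched without any Alfresco types. In the hypothetical example below, mappedKeys stands in for the key set of the map returned by getMapping(), and expensiveLookup stands in for a costly per-property read from the document; both names, and the class MappingDrivenExtractionSketch, are assumptions made for illustration.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Alfresco
final class MappingDrivenExtractionSketch
{
    /**
     * Reads only the document properties whose keys appear in the mapping.
     * Keys for which the document has no value are skipped in this sketch.
     */
    static Map<String, Serializable> extractMapped(
            Set<String> mappedKeys,
            Function<String, Serializable> expensiveLookup)
    {
        Map<String, Serializable> raw = new HashMap<>();
        for (String key : mappedKeys)
        {
            // The costly read happens only for keys that some mapping asks for
            Serializable value = expensiveLookup.apply(key);
            if (value != null)
            {
                raw.put(key, value);
            }
        }
        return raw;
    }

    public static void main(String[] args)
    {
        // Stand-in for a document with user-defined properties
        Map<String, String> userProperties = new HashMap<>();
        userProperties.put("user1", "A summary");
        userProperties.put("user2", "A description");
        userProperties.put("user3", "Never mapped, so never read");

        Set<String> mappedKeys = new HashSet<>(Arrays.asList("user1", "user2"));
        System.out.println(extractMapped(mappedKeys, userProperties::get));
    }
}
```

Declaring a further mapping for user3 would add it to mappedKeys, and the value would then be extracted without any code change.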
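protected java.util.Map readMappingProperties(java.lang.String propertiesUrl)
A utility method to read mapping properties from a resource file and convert to the map form.
Parameters:
propertiesUrl - a standard Properties file URL location
See Also: AbstractMappingMetadataExtracter.setMappingProperties(Properties)
protected java.util.Map readMappingProperties(java.util.Properties mappingProperties)
A utility method to convert mapping properties to the Map form.
See Also: AbstractMappingMetadataExtracter.setMappingProperties(Properties)
public final void register()
Registers this instance of the extracter with the registry. This calls the AbstractMappingMetadataExtracter.init() method and then registers if the registry is available.
See Also: AbstractMappingMetadataExtracter.setRegistry(MetadataExtracterRegistry), AbstractMappingMetadataExtracter.init()
protected void init()
Provides a hook point for implementations to perform initialization. The default mappings will be requested during initialization.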
public long getExtractionTime()
Provides an estimate, usually a worst case guess, of how long an extraction will take.
This method is used to determine, up front, which of a set of equally reliable transformers will be used for a specific extraction.
Specified by: getExtractionTime in interface MetadataExtracter
protected void checkIsSupported(org.alfresco.service.cmr.repository.ContentReader reader)
Checks if the mimetype is supported.
Parameters:
reader - the reader to check
Throws:
org.alfresco.error.AlfrescoRuntimeException - if the mimetype is not supported
public final java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader, java.util.Map destination)
Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map. [...] overwrite policy between document metadata and system metadata will be used.
The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).
The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.
Specified by: extract in interface MetadataExtracter
Parameters:
reader - the source of the content
destination - the map of properties to populate (essentially a return value)
See Also: MetadataExtracter.extract(ContentReader, OverwritePolicy, Map, Map)
public final java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, java.util.Map destination)
Extracts the metadata values from the content provided by the reader and source mimetype to the supplied map.
The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).
The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.
Specified by: extract in interface MetadataExtracter
Parameters:
reader - the source of the content
overwritePolicy - the policy stipulating how the system properties must be overwritten if present
destination - the map of properties to populate (essentially a return value)
See Also: MetadataExtracter.extract(ContentReader, OverwritePolicy, Map, Map)
public java.util.Map extract(org.alfresco.service.cmr.repository.ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, java.util.Map destination, java.util.Map mapping)
Extracts the metadata from the content provided by the reader and source mimetype to the supplied map. [...] overwrite policy is also explicitly set.
The extraction viability can be determined by an up front call to MetadataExtracter.isSupported(String).
The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.
Specified by: extract in interface MetadataExtracter
Parameters:
reader - the source of the content
overwritePolicy - the policy stipulating how the system properties must be overwritten if present
destination - the map of properties to populate (essentially a return value)
mapping - a mapping of document-specific properties to system properties.
See Also: MetadataExtracter.extract(ContentReader, Map)
protected void filterSystemProperties(java.util.Map systemProperties, java.util.Map targetProperties)
Filters the system properties that are going to be applied.
Parameters:
systemProperties - map of system properties to be applied
targetProperties - map of target properties, may be used to provide the context required
protected java.util.Date makeDate(java.lang.String dateStr)
Convert a date String to a Date object.
protected boolean putRawValue(java.lang.String key, java.io.Serializable value, java.util.Map destination)
Adds a value to the map, conserving null values.
Parameters:
key - the destination key
value - the serializable value
destination - the map to put values into
protected final java.util.Map newRawMap()
Helper method to fetch a clean map into which raw values can be dumped.
protected java.util.Map getDefaultMapping()
This method provides a best guess of where to store the values extracted from the documents.
The default implementation looks for the default mapping file in the location given by the class name plus .properties. If the extracter's class is x.y.z.MyExtracter then the default properties will be picked up at classpath:/x/y/z/MyExtracter.properties. Inner classes are supported, but the '$' in the class name is replaced with '-', so default properties for x.y.z.MyStuff$MyExtracter will be located using x.y.z.MyStuff-MyExtracter.properties.
The default mapping implementation should include thorough Javadocs so that system administrators can accurately determine how best to enhance or override the default mapping.
If the default mapping is declared in a properties file other than the one named after the class, then the AbstractMappingMetadataExtracter.readMappingProperties(String) method can be used to quickly generate the return value:
    protected Map<String, Set<QName>> getDefaultMapping()
    {
        return readMappingProperties(DEFAULT_MAPPING);
    }
The map can also be created in code either statically or during the call.
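The naming rule for the default properties file can be illustrated with a small, self-contained sketch. DefaultMappingLocationSketch and defaultPropertiesPath are hypothetical names, and the sketch only builds the location string from a class name; it does not load anything and is not the Alfresco implementation.

```java
import java.util.Map;

// Hypothetical helper, not part of Alfresco
final class DefaultMappingLocationSketch
{
    /** Builds the default mapping location for a class, replacing '$' with '-' for inner classes. */
    static String defaultPropertiesPath(Class<?> extracterClass)
    {
        // Class.getName() separates packages with '.' and inner classes with '$'
        String name = extracterClass.getName();
        return "classpath:/" + name.replace('.', '/').replace('$', '-') + ".properties";
    }

    public static void main(String[] args)
    {
        // JDK classes are used only because they are always available
        System.out.println(defaultPropertiesPath(String.class));    // classpath:/java/lang/String.properties
        System.out.println(defaultPropertiesPath(Map.Entry.class)); // classpath:/java/util/Map-Entry.properties
    }
}
```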
See Also: AbstractMappingMetadataExtracter.setInheritDefaultMapping(boolean inherit)
protected abstract java.util.Map extractRaw(org.alfresco.service.cmr.repository.ContentReader reader) throws java.lang.Throwable
Override to provide the raw extracted metadata values. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently, and more or fewer of the properties may be used in different installations.
Raw values must not be trimmed or removed for any reason. Null values and empty strings are [...] OverwritePolicy
Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:
    editor: - the document editor --> cm:author
    title: - the document title --> cm:title
    user1: - the document summary
    user2: - the document description --> cm:description
    user3: -
    user4: -
Parameters:
reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.
Throws:
java.lang.Throwable - all exception conditions can be handled.
See Also: AbstractMappingMetadataExtracter.getDefaultMapping()