Class DictionaryMetadata

java.lang.Object
morfologik.stemming.DictionaryMetadata

public final class DictionaryMetadata extends Object
Description of attributes, their types and default values.
  • Field Details

    • METADATA_FILE_EXTENSION

      public static final String METADATA_FILE_EXTENSION
      Expected metadata file extension.
  • Constructor Details

  • Method Details

    • getAttributes

      public Map<DictionaryAttribute, String> getAttributes()
      Returns:
      All metadata attributes.
    • getEncoding

      public String getEncoding()
    • getSeparator

      public byte getSeparator()
    • getLocale

      public Locale getLocale()
    • getInputConversionPairs

      public LinkedHashMap<String,String> getInputConversionPairs()
    • getOutputConversionPairs

      public LinkedHashMap<String,String> getOutputConversionPairs()
    • getReplacementPairs

      public LinkedHashMap<String, List<String>> getReplacementPairs()
    • getEquivalentChars

      public LinkedHashMap<Character, List<Character>> getEquivalentChars()
    • isFrequencyIncluded

      public boolean isFrequencyIncluded()
    • isIgnoringPunctuation

      public boolean isIgnoringPunctuation()
    • isIgnoringNumbers

      public boolean isIgnoringNumbers()
    • isIgnoringCamelCase

      public boolean isIgnoringCamelCase()
    • isIgnoringAllUppercase

      public boolean isIgnoringAllUppercase()
    • isIgnoringDiacritics

      public boolean isIgnoringDiacritics()
    • isConvertingCase

      public boolean isConvertingCase()
    • isSupportingRunOnWords

      public boolean isSupportingRunOnWords()
    • getDecoder

      public CharsetDecoder getDecoder()
      Returns:
      Returns a new CharsetDecoder for the encoding.
    • getEncoder

      public CharsetEncoder getEncoder()
      Returns:
      Returns a new CharsetEncoder for the encoding.
    • getSequenceEncoderType

      public EncoderType getSequenceEncoderType()
      Returns:
      The sequence encoder type.
    • getSeparatorAsChar

      public char getSeparatorAsChar()
      Returns:
      Returns the separator byte converted to a single char.
      Throws:
      RuntimeException - if the conversion is impossible (for example, the byte does not decode to a single char, or the FSA's encoding is not available).
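      The conversion described above can be illustrated with the JDK alone. This is a minimal sketch, not the library's implementation: the class name SeparatorAsCharSketch and the helper separatorAsChar are hypothetical, and the encoding is passed in explicitly instead of being read from the metadata.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical stand-alone sketch; it does not use DictionaryMetadata.
class SeparatorAsCharSketch {
    // Decodes one separator byte in the given encoding and expects exactly one char.
    static char separatorAsChar(byte separator, String encoding) {
        try {
            CharsetDecoder decoder = Charset.forName(encoding)
                .newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
            CharBuffer decoded = decoder.decode(ByteBuffer.wrap(new byte[] { separator }));
            if (decoded.remaining() != 1) {
                throw new RuntimeException("Separator does not decode to a single char.");
            }
            return decoded.get();
        } catch (CharacterCodingException | IllegalArgumentException e) {
            // IllegalArgumentException covers unknown or illegal charset names.
            throw new RuntimeException("Separator byte cannot be converted to a char.", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(separatorAsChar((byte) '+', "UTF-8")); // prints "+"
    }
}
```

      A byte that is only the first half of a multi-byte sequence (for example 0xC3 in UTF-8) is reported as malformed input and ends up as a RuntimeException in this sketch.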
    • builder

      public static DictionaryMetadataBuilder builder()
      Returns:
      A shortcut returning DictionaryMetadataBuilder.
    • getExpectedMetadataFileName

      public static String getExpectedMetadataFileName(String dictionaryFile)
      Returns the expected name of the metadata file, based on the name of the dictionary file. The expected name is resolved by truncating any file extension of dictionaryFile and appending METADATA_FILE_EXTENSION.
      Parameters:
      dictionaryFile - The name of the dictionary (*.dict) file.
      Returns:
      Returns the expected name of the metadata file.
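      The name resolution described above (truncate the extension, append METADATA_FILE_EXTENSION) can be sketched in plain Java. This is an illustration only: the class MetadataFileNameSketch is hypothetical, and the extension value "info" is an assumption, since this page does not show the constant's value.

```java
// Hypothetical stand-alone sketch; it does not call the library.
class MetadataFileNameSketch {
    // Assumption: the real constant's value is not shown on this page.
    static final String METADATA_FILE_EXTENSION = "info";

    // Drops everything from the last dot onward (if any) and appends the metadata extension.
    static String expectedMetadataFileName(String dictionaryFile) {
        int dot = dictionaryFile.lastIndexOf('.');
        String base = dot >= 0 ? dictionaryFile.substring(0, dot) : dictionaryFile;
        return base + "." + METADATA_FILE_EXTENSION;
    }

    public static void main(String[] args) {
        System.out.println(expectedMetadataFileName("polish.dict")); // prints "polish.info"
        System.out.println(expectedMetadataFileName("polish"));      // prints "polish.info"
    }
}
```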
    • getExpectedMetadataLocation

      public static Path getExpectedMetadataLocation(Path dictionary)
      Parameters:
      dictionary - The location of the dictionary file.
      Returns:
      Returns the expected location of a metadata file.
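      A Path-based variant might look like the sketch below. It assumes the metadata file sits next to the dictionary file and reuses the same extension rule as getExpectedMetadataFileName; neither detail is stated on this page. The class name and the "info" extension value are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-alone sketch; it does not call the library.
class MetadataLocationSketch {
    // Assumption: the real constant's value is not shown on this page.
    static final String METADATA_FILE_EXTENSION = "info";

    // Assumption: the metadata file is a sibling of the dictionary file.
    static Path expectedMetadataLocation(Path dictionary) {
        String name = dictionary.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot >= 0 ? name.substring(0, dot) : name;
        return dictionary.resolveSibling(base + "." + METADATA_FILE_EXTENSION);
    }

    public static void main(String[] args) {
        Path dictionary = Paths.get("dicts", "polish.dict");
        // Resolves to polish.info inside the same "dicts" directory.
        System.out.println(expectedMetadataLocation(dictionary));
    }
}
```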
    • read

      public static DictionaryMetadata read(InputStream metadataStream) throws IOException
      Read dictionary metadata from a property file (stream).
      Parameters:
      metadataStream - The stream with metadata.
      Returns:
      Returns DictionaryMetadata read from the stream (property file).
      Throws:
      IOException - Thrown if an I/O exception occurs.
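      Because the metadata is a property file, its general shape can be illustrated with java.util.Properties. The sketch below only loads raw key/value pairs; it does not build a DictionaryMetadata. The class name, the attribute keys in the sample, and the UTF-8 charset are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not call the library.
class MetadataReadSketch {
    // Loads key/value pairs from a property-file stream (assumed to be UTF-8 text).
    static Map<String, String> readAttributes(InputStream metadataStream) throws IOException {
        Properties properties = new Properties();
        properties.load(new InputStreamReader(metadataStream, StandardCharsets.UTF_8));
        Map<String, String> attributes = new TreeMap<>();
        for (String key : properties.stringPropertyNames()) {
            attributes.put(key, properties.getProperty(key));
        }
        return attributes;
    }

    public static void main(String[] args) throws IOException {
        // The key names below are assumptions; this page does not list attribute keys.
        String sample = "fsa.dict.encoding=UTF-8\nfsa.dict.separator=+\n";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAttributes(in));
    }
}
```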
    • write

      public void write(Writer writer) throws IOException
      Write dictionary attributes (metadata).
      Parameters:
      writer - The writer to write to.
      Throws:
      IOException - Thrown when an I/O error occurs.
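      As a rough illustration of writing attributes to a Writer, the sketch below emits one key=value line per entry. The output format of the real method is not specified on this page, so the format, the class name, and the sample keys are assumptions; escaping of special characters is deliberately left out.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; it does not call the library.
class MetadataWriteSketch {
    // Writes each attribute as a "key=value" line, in the map's iteration order.
    static void writeAttributes(Map<String, String> attributes, Writer writer) throws IOException {
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            writer.write(entry.getKey() + "=" + entry.getValue() + "\n");
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("fsa.dict.encoding", "UTF-8"); // assumed key name
        attributes.put("fsa.dict.separator", "+");    // assumed key name
        StringWriter out = new StringWriter();
        writeAttributes(attributes, out);
        System.out.print(out);
    }
}
```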