Class FSABuilder

java.lang.Object
morfologik.fsa.builders.FSABuilder

public final class FSABuilder extends Object
A fast, memory-conservative finite state automaton builder. It returns an in-memory FSA whose representation is a trade-off between construction speed and memory consumption. Use serializers to compress the returned automaton into a more compact form.
  • Field Details

    • LEXICAL_ORDERING

      public static final Comparator<byte[]> LEXICAL_ORDERING
      A comparator that compares full byte arrays lexicographically, treating each byte as unsigned ('C'-locale ordering).
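To illustrate what unsigned byte ordering means in practice, here is a minimal JDK-only sketch. `UnsignedLexicalOrderSketch` and `UNSIGNED_LEXICAL` are hypothetical names, not part of morfologik. The sketch assumes that a sequence sorts before any longer sequence it is a prefix of, which the description above does not state explicitly.

```java
import java.util.Arrays;
import java.util.Comparator;

final class UnsignedLexicalOrderSketch {
  // Hypothetical stand-in for an unsigned lexicographic comparator (JDK 9+).
  // It is NOT the library's LEXICAL_ORDERING, only an illustration of the idea.
  static final Comparator<byte[]> UNSIGNED_LEXICAL = Arrays::compareUnsigned;

  public static void main(String[] args) {
    byte[] low = {0x7f};          // 127 whether read as signed or unsigned
    byte[] high = {(byte) 0x80};  // -128 as a signed byte, 128 as unsigned

    // Unsigned comparison: 0x7f (127) sorts before 0x80 (128).
    System.out.println(UNSIGNED_LEXICAL.compare(low, high) < 0); // prints true

    // Signed comparison gives the opposite answer: 127 > -128.
    System.out.println(Arrays.compare(low, high) < 0); // prints false
  }
}
```

The difference only shows up for bytes at or above 0x80, for example in UTF-8 encoded non-ASCII text.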
  • Constructor Details

    • FSABuilder

      public FSABuilder()
    • FSABuilder

      public FSABuilder(int bufferGrowthSize)
      Parameters:
      bufferGrowthSize - Buffer growth size (in bytes) when constructing the automaton.
  • Method Details

    • add

      public void add(byte[] sequence, int start, int len)
      Add a single sequence of bytes to the FSA. The input must be lexicographically greater than any previously added sequence.
      Parameters:
      sequence - The array holding the input sequence of bytes.
      start - Starting offset (inclusive) into sequence.
      len - Length of the input sequence (at least 1 byte).
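The ordering precondition applies to the slice described by `start` and `len`, not to the whole array. The following JDK-only sketch shows one way a caller might check that precondition before adding a slice. `AddOrderCheckSketch` and `isStrictlyGreater` are hypothetical helpers, and the sketch assumes the required order is the unsigned byte order described for `LEXICAL_ORDERING`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AddOrderCheckSketch {
  /**
   * Hypothetical helper (not part of FSABuilder): returns true if the slice
   * seq[start, start + len) is strictly greater, in unsigned byte order,
   * than the slice prev[prevStart, prevStart + prevLen).
   */
  static boolean isStrictlyGreater(byte[] prev, int prevStart, int prevLen,
                                   byte[] seq, int start, int len) {
    // The range overload takes exclusive end indexes, hence start + len.
    return Arrays.compareUnsigned(
        seq, start, start + len,
        prev, prevStart, prevStart + prevLen) > 0;
  }

  public static void main(String[] args) {
    // Two sequences packed into one buffer: "abc" at offset 0, "abd" at offset 3.
    byte[] buffer = "abcabd".getBytes(StandardCharsets.US_ASCII);

    // "abd" follows "abc", so it could be added after it.
    System.out.println(isStrictlyGreater(buffer, 0, 3, buffer, 3, 3)); // prints true
  }
}
```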
    • complete

      public FSA complete()
      Returns:
      The finished automaton. Calling this method finalizes its construction.
    • build

      public static FSA build(byte[][] input)
      Build a minimal, deterministic automaton from a sorted list of byte sequences.
      Parameters:
      input - Input sequences to build automaton from.
      Returns:
      The automaton encoding all input sequences.
    • build

      public static FSA build(Iterable<byte[]> input)
      Build a minimal, deterministic automaton from an iterable of byte sequences.
      Parameters:
      input - Input sequences to build automaton from.
      Returns:
      The automaton encoding all input sequences.
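Because the array variant of `build` expects sorted input, callers typically need to prepare their data first. The JDK-only sketch below shows one possible way to turn a list of strings into sorted, duplicate-free byte sequences. `SortedInputSketch` and `toSortedInput` are hypothetical names, and the choice of UTF-8 is an assumption. The sketch also assumes that the `Iterable` variant expects the same ordering, which the description above does not state.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class SortedInputSketch {
  /**
   * Hypothetical helper (not part of FSABuilder): encodes each word as UTF-8,
   * removes duplicate words, and sorts the byte sequences in unsigned
   * lexicographic order.
   */
  static byte[][] toSortedInput(List<String> words) {
    byte[][] encoded = words.stream()
        .distinct()
        .map(w -> w.getBytes(StandardCharsets.UTF_8))
        .toArray(byte[][]::new);
    // Unsigned comparison matters for non-ASCII text: UTF-8 lead bytes are
    // >= 0x80 and would sort before ASCII under a signed comparison.
    Arrays.sort(encoded, Arrays::compareUnsigned);
    return encoded;
  }

  public static void main(String[] args) {
    // "\u00e9" is e-acute, encoded in UTF-8 as the two bytes C3 A9.
    byte[][] input = toSortedInput(List.of("b", "a", "\u00e9", "ab"));

    // Resulting order: a, ab, b, e-acute.
    for (byte[] sequence : input) {
      System.out.println(new String(sequence, StandardCharsets.UTF_8));
    }
  }
}
```

An array prepared this way is the kind of input one would then pass to `FSABuilder.build(byte[][])`.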
    • getInfo

      public Map<FSABuilder.InfoEntry, Object> getInfo()
      Returns:
      Various statistics concerning the FSA and its compilation.