regexp_learner.gold package

Submodules

regexp_learner.gold.gold module

gold(s_plus: iter, s_minus: iter, sigma: str = 'abcdefghijklmnopqrstuvwxyz0123456789 ', red_states: set = {''}, fill_holes: bool = False, blue_state_choice_func: callable = <built-in function min>, red_state_choice_func: callable = <built-in function min>, verbose: bool = False) tuple[source]

Runs the GOLD algorithm.

Parameters:
  • s_plus (iter) – An iterable of strings that are present in the language to infer.

  • s_minus (iter) – An iterable of strings that are not present in the language to infer.

  • sigma (str) – An iterable of chars, represents the alphabet.

  • red_states (set) – An iterable of strings, should remain default to run Gold algorithm.

  • fill_holes (bool) – If True, the function uses the filling holes method. If False, the function won’t fill holes in the table, but search for compatible successors when building the automaton.

  • blue_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which blue state to promote among the candidates.

  • red_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which red state to choose among the red_states which are compatible with a blue one.

  • verbose (bool) – Pass True to output in HTML the important steps of the algorithm.

Returns:

g is the inferred Automaton; success equals True iff the algorithm succeeded If success equals False, then g is the Prefix Tree Acceptor (PTA) accepting s_plus.

Return type:

A tuple (g, success) where

regexp_learner.gold.observation_table module

class GoldObservationTable(s_plus: set, s_minus: set, sigma: str = 'abcdefghijklmnopqrstuvwxyz0123456789 ', red_states: set = {''}, fill_holes: bool = False, blue_state_choice_func: callable = <built-in function min>, red_state_choice_func: callable = <built-in function min>)[source]

Bases: object

The GoldObservationTable class implements the observation table used in the Gold algorithm.

Constructor.

Parameters:
  • s_plus (iter) – An iterable of strings that are present in the language to infer.

  • s_minus (iter) – An iterable of strings that are not present in the language to infer.

  • sigma (str) – An iterable of chars, represents the alphabet.

  • red_states (set) – An iterable of strings, should remain default to run Gold algorithm.

  • fill_holes (bool) – If True, the function uses the filling holes method. If False, the function won’t fill holes in the table, but search for compatible successors when building the automaton.

  • blue_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which blue state to promote among the candidates.

  • red_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which red state to choose among the red_states which are compatible with a blue one.

  • show_tables_as_html (bool) – Pass True to output in HTML the important steps of the algorithm.

ONE = 1
STAR = '*'
ZERO = 0
static are_obviously_different(row1: list, row2: list) bool[source]

Checks whether two rows are obviously different.

Parameters:
Returns:

True iff one of these two row contains at least one ONE and the other row contains at least one ZERO at a given index.

static check_input_consistency(s_plus, s_minus, sigma, red_states)[source]

Checks that the input given to build an observation table is consistent.

Parameters:
  • s_plus (iter) – An iterable of strings that are present in the language to infer.

  • s_minus (iter) – An iterable of strings that are not present in the language to infer.

  • sigma (str) – An iterable of chars, represents the alphabet.

  • red_states (set) – An iterable of strings, should remain default to run Gold algorithm.

Raises:

A RuntimeError exception if the input data is not consistent.

choose_compatible_red_state(row)[source]

Finds a red state that is compatible according to a row.

Parameters:

row (list) – A vector of values in {ONE, ZERO, STAR}, corresponding to a blue state.

Returns:

A red state that is compatible (not obviously different)

choose_obviously_different_blue_state() int[source]

Finds a blue state (row) that is obviously different from all the red states.

Returns:

A state (if found), None otherwise.

get_value_from_sample(w: str) int[source]

Returns the value used to fill this GoldObservationTable for a given word.

Parameters:

w (str) – An arbitray word.

Returns:

  • ONE if w is in self.s_plus,

  • ZERO if it is in s_minus,

  • STAR otherwise

is_consistent_with_samples(g: Automaton) bool[source]

Checks if a given automaton complies with the positive and negative examples.

_Remarks:_

  • Using the pybgl.Automaton implementation, this method always returns True.

  • If the automaton class supports rejecting states, GoldObservationTable.is_consistent_with_samples() should be overloaded and check whether g is consistent with self.s_plus (positive examples) and with self.s_minus (negative examples).

Parameters:

g (Automaton) – An automaton instance.

Returns:

True if g accepts the positive examples and rejects the negative examples, False otherwise.

make_pta() Trie[source]

Builds the PTA (Prefix Tree Acceptor) corresponding to the positive examples related to this GoldObservationTable instance.

Returns:

The corresponding pybgl.Trie instance.

to_automaton() tuple[source]

Builds an automaton from the observation table information.

Returns:

g is the inferred pybgl.Automaton; success equals True iff the algorithm succeeded in building an automaton. If False, g is the Prefix Tree Acceptor (PTA) accepting s_plus.

Return type:

A tuple (g, success) where

to_html() str[source]

Exports to HTML this GoldObservationTable instance.

Returns:

The corresponding HTML string.

try_and_fill_holes()[source]

Tries to fill all the holes (STAR) that are in the observation table after the promoting phase.

Returns:

True if it succeeds, False otherwise.

try_and_promote_blue() bool[source]

Tries to find a blue state to promote (cf Gold algorithm). If such a state is found, the function promotes it and updates this GoldObservationTable accordingly.

Returns:

True iff a state has been promoted, False otherwise.

Module contents