regexp_learner.gold package¶

Submodules¶

regexp_learner.gold.gold module¶

gold(s_plus: iter, s_minus: iter, sigma: str = 'abcdefghijklmnopqrstuvwxyz0123456789 ', red_states: set = {''}, fill_holes: bool = False, blue_state_choice_func: callable = <built-in function min>, red_state_choice_func: callable = <built-in function min>, verbose: bool = False) → tuple[source]¶

Runs the GOLD algorithm.

Parameters:

s_plus (iter) – An iterable of strings that are present in the language to infer.
s_minus (iter) – An iterable of strings that are not present in the language to infer.
sigma (str) – An iterable of chars, represents the alphabet.
red_states (set) – An iterable of strings, should remain default to run Gold algorithm.
fill_holes (bool) – If True, the function uses the filling holes method. If False, the function won’t fill holes in the table, but search for compatible successors when building the automaton.
blue_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which blue state to promote among the candidates.
red_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which red state to choose among the red_states which are compatible with a blue one.
verbose (bool) – Pass True to output in HTML the important steps of the algorithm.

Returns:

g is the inferred Automaton; success equals True iff the algorithm succeeded If success equals False, then g is the Prefix Tree Acceptor (PTA) accepting s_plus.

Return type:

A tuple (g, success) where

regexp_learner.gold.observation_table module¶

class GoldObservationTable(s_plus: set, s_minus: set, sigma: str = 'abcdefghijklmnopqrstuvwxyz0123456789 ', red_states: set = {''}, fill_holes: bool = False, blue_state_choice_func: callable = <built-in function min>, red_state_choice_func: callable = <built-in function min>)[source]¶

Bases: object

The GoldObservationTable class implements the observation table used in the Gold algorithm.

Constructor.

Parameters:

s_plus (iter) – An iterable of strings that are present in the language to infer.
s_minus (iter) – An iterable of strings that are not present in the language to infer.
sigma (str) – An iterable of chars, represents the alphabet.
red_states (set) – An iterable of strings, should remain default to run Gold algorithm.
fill_holes (bool) – If True, the function uses the filling holes method. If False, the function won’t fill holes in the table, but search for compatible successors when building the automaton.
blue_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which blue state to promote among the candidates.
red_state_choice_func (callable) – A Iterable[str] -> str function, used to choose which red state to choose among the red_states which are compatible with a blue one.
show_tables_as_html (bool) – Pass True to output in HTML the important steps of the algorithm.

ONE = 1¶

STAR = '*'¶

ZERO = 0¶

static are_obviously_different(row1: list, row2: list) → bool[source]¶

Checks whether two rows are obviously different.

Parameters:

row1 (list) – A row of this GoldObservationTable.
row2 (list) – A row of this GoldObservationTable.

Returns:

True iff one of these two row contains at least one ONE and the other row contains at least one ZERO at a given index.

static check_input_consistency(s_plus, s_minus, sigma, red_states)[source]¶

Checks that the input given to build an observation table is consistent.

Parameters:

s_plus (iter) – An iterable of strings that are present in the language to infer.
s_minus (iter) – An iterable of strings that are not present in the language to infer.
sigma (str) – An iterable of chars, represents the alphabet.
red_states (set) – An iterable of strings, should remain default to run Gold algorithm.

Raises:

A RuntimeError exception if the input data is not consistent. –

choose_compatible_red_state(row)[source]¶

Finds a red state that is compatible according to a row.

Parameters:: row (list) – A vector of values in {ONE, ZERO, STAR}, corresponding to a blue state.
Returns:: A red state that is compatible (not obviously different)

choose_obviously_different_blue_state() → int[source]¶

Finds a blue state (row) that is obviously different from all the red states.

Returns:: A state (if found), None otherwise.

get_value_from_sample(w: str) → int[source]¶

Returns the value used to fill this GoldObservationTable for a given word.

Parameters:

w (str) – An arbitray word.

Returns:

ONE if w is in self.s_plus,
ZERO if it is in s_minus,
STAR otherwise

is_consistent_with_samples(g: Automaton) → bool[source]¶

Checks if a given automaton complies with the positive and negative examples.

_Remarks:_

Using the pybgl.Automaton implementation, this method always returns True.
If the automaton class supports rejecting states, GoldObservationTable.is_consistent_with_samples() should be overloaded and check whether g is consistent with self.s_plus (positive examples) and with self.s_minus (negative examples).

Parameters:: g (Automaton) – An automaton instance.
Returns:: True if g accepts the positive examples and rejects the negative examples, False otherwise.

make_pta() → Trie[source]¶

Builds the PTA (Prefix Tree Acceptor) corresponding to the positive examples related to this GoldObservationTable instance.

Returns:: The corresponding pybgl.Trie instance.

to_automaton() → tuple[source]¶

Builds an automaton from the observation table information.

Returns:: g is the inferred pybgl.Automaton; success equals True iff the algorithm succeeded in building an automaton. If False, g is the Prefix Tree Acceptor (PTA) accepting s_plus.
Return type:: A tuple (g, success) where

to_html() → str[source]¶

Exports to HTML this GoldObservationTable instance.

Returns:: The corresponding HTML string.

try_and_fill_holes()[source]¶

Tries to fill all the holes (STAR) that are in the observation table after the promoting phase.

Returns:: True if it succeeds, False otherwise.

try_and_promote_blue() → bool[source]¶

Tries to find a blue state to promote (cf Gold algorithm). If such a state is found, the function promotes it and updates this GoldObservationTable accordingly.

Returns:: True iff a state has been promoted, False otherwise.

regexp_learner.gold package¶

Submodules¶

regexp_learner.gold.gold module¶

regexp_learner.gold.observation_table module¶

Module contents¶