# Factorize a set of tuples into Cartesian products

Greetings,

I'd like to have your opinion on how to handle the following problem:

Input: a series of log entries that are, in essence, triplets (source, destination, protocol)

Output:
a series of address sets
a series of protocol sets
a series of access rules of the form ({src addr set}, {dst addr set}, {proto set})

The solution should be optimized primarily for the number of access rules and secondarily for the number of sets. Sets can include other sets of the same type.

It's OK for rules to have non-empty intersections, i.e. the same triplet may be covered by more than one rule.

There is no point in aggregating either addresses or protocols into subnets or ranges.

I intend to use the algorithm on a set of a few thousand log entries.
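
As a possible starting point (a rough sketch, not a known-optimal method), one can merge triplets one dimension at a time: first collect the protocols per (source, destination) pair, then merge destinations that share a source and protocol set, then merge sources that share a destination set and protocol set. The function name `factorize` and the tuple layout are assumptions made for illustration. Each merge keeps the other two coordinates fixed, so the resulting rules cover exactly the logged triplets, but the rule count is not guaranteed to be minimal, and nested sets and the secondary objective are not addressed.

```python
from collections import defaultdict

def factorize(triplets):
    """Greedy merge: group protocols, then destinations, then sources."""
    # Step 1: for each (src, dst) pair, collect every protocol seen.
    protos = defaultdict(set)
    for src, dst, proto in triplets:
        protos[(src, dst)].add(proto)

    # Step 2: per source, merge destinations that share the same protocol set.
    dsts = defaultdict(set)
    for (src, dst), pset in protos.items():
        dsts[(src, frozenset(pset))].add(dst)

    # Step 3: merge sources that share both destination set and protocol set.
    srcs = defaultdict(set)
    for (src, pset), dset in dsts.items():
        srcs[(frozenset(dset), pset)].add(src)

    # Each rule is (source set, destination set, protocol set).
    return [(frozenset(sset), dset, pset) for (dset, pset), sset in srcs.items()]

# Sample input from this post, written as (source, destination, protocol).
logs = [
    ("a", "b", "x"), ("a", "b", "y"), ("a", "c", "x"), ("d", "b", "y"),
    ("a", "d", "x"), ("a", "c", "z"), ("a", "d", "z"),
]

rules = factorize(logs)
for s, d, p in rules:
    print(sorted(s), "->", sorted(d), ":", sorted(p))
```

On the sample input this yields three rules. Because the rules it produces never overlap, it cannot exploit the freedom to let rules intersect, so on other inputs an overlapping solution may need fewer rules. The merge order (protocols, destinations, sources) is also arbitrary; trying the other orders and keeping the smallest result is a cheap improvement.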

Input example:
a->b : x
a->b : y
a->c : x
d->b : y
a->d : x
a->c : z
a->d : z

Output would be:
A1 = {a}
A2 = {b}
A3 = {a,d}
A4 = {b,c}
A5 = {c,d}

P1 = {x}
P2 = {y}
P3 = {x,z}

A3->A2: P2
A1->A4: P1
A1->A5: P3
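
Any candidate output can be checked by expanding each rule into its Cartesian product and comparing the result with the logged triplets. The sketch below does this for the example, reading A4 as {b,c} and P3 as {x,z}, which is what the input triplets require (with y in P3 the third rule would also allow a->c : y and a->d : y, which were never logged, and a->d : x would not be covered). The helper name `expand` is an assumption for illustration.

```python
from itertools import product

# Logged triplets from the input example: (source, destination, protocol).
logs = {
    ("a", "b", "x"), ("a", "b", "y"), ("a", "c", "x"), ("d", "b", "y"),
    ("a", "d", "x"), ("a", "c", "z"), ("a", "d", "z"),
}

# Address and protocol sets from the example output.
A1, A2, A3, A4, A5 = {"a"}, {"b"}, {"a", "d"}, {"b", "c"}, {"c", "d"}
P1, P2, P3 = {"x"}, {"y"}, {"x", "z"}

# Access rules as (source set, destination set, protocol set).
rules = [(A3, A2, P2), (A1, A4, P1), (A1, A5, P3)]

def expand(rules):
    """Return every triplet allowed by at least one rule."""
    allowed = set()
    for srcs, dsts, protos in rules:
        allowed.update(product(srcs, dsts, protos))
    return allowed

allowed = expand(rules)
missing = logs - allowed   # logged traffic that no rule covers
extra = allowed - logs     # traffic the rules allow but that was never logged

print("missing:", sorted(missing))
print("extra:", sorted(extra))
```

Both lists come out empty for these three rules. Note that a->c : x is covered by two rules, which is allowed since rules may intersect.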