Isomorphic Insertation with Raku

by Arne Sommer

Isomorphic Insertation with Raku

[108] Published 27. December 2020.

This is my response to the Perl Weekly Challenge #092.

Challenge #092.1: Isomorphic Strings

You are given two strings $A and $B.

Write a script to check if the given strings are Isomorphic. Print 1 if they are otherwise 0.

Example 1:
Input: $A = "abc"; $B = "xyz"
Output: 1
Example 2:
Input: $A = "abb"; $B = "xyy"
Output: 1
Example 3:
Input: $A = "sum"; $B = "add"
Output: 0

The point is a one-to-one mapping. So if we set up a hash, with the characters from $A as keys, and the characters from $B as the values. Duplicate keys (from $A) are ok, as long as they lead to the same value (from $B); e.g. $A = 'ABA'; $B = '1*1' is ok. Duplicates on the value side (from $B) are ok in a hash, but not for us here. We can get rid of them like this: %hash.values.unique.

File: isomorphic-strings-zip
#! /usr/bin/env raku

unit sub MAIN (Str $A, Str $B, :v($verbose));    # [1]

if $A.chars != $B.chars                          # [2]
{
  say ": Different length" if $verbose;
  say 0;
  exit;
}

my %A2B = ($A.comb Z $B.comb)>>.hash;            # [3]

say ": Hash: { %A2B.raku }" if $verbose;

say %A2B.keys.elems == %A2B.values.unique.elems  # [4]
  ?? 1                                           # [4a]
  !! 0;                                          # [4b]

[1] The Str type does not actually restrict the input, so we could drop it.

[2] Check that the two strings have the same length. If not, say so and exit.

[3] The Z infix operator (which has a prefix version zip) takes one element from each list at a time and gives a list back (as a zipper) with the tuples as sublists. We want a hash, and a two-dimentional list cannot be coerced to one, so we have to coerce each pair of values to a hash before the assignment to the hash to make this work.

[4] Ensure a one-to-one mapping, and print 1 on success [4a] and 0 if not [4b].

See docs.raku.org/routine/Z for more information about Z.

See docs.raku.org/routine/zip for more information about zip.

The reason Z resturns a list of lists (and not a list of pair objects, that we could have turned into a hash with a simple assignment) is that is is a spceial case of of the zip function, that can take two or more lists. Three elements do not easily fit in Pair objects...

See docs.raku.org/type/Pair for more information about the Pair type.

Running it:

$ ./isomorphic-strings-zip abc xyz
1

$ ./isomorphic-strings-zip -v abc xyz
: Hash: {:a("x"), :b("y"), :c("z")}
1

$ ./isomorphic-strings-zip abb xyy
1

$ ./isomorphic-strings-zip -v abb xyy
: Hash: {:a("x"), :b("y")}
1

$ ./isomorphic-strings-zip sum add

0
$ ./isomorphic-strings-zip -v sum add
: Hash: {:m("d"), :s("a"), :u("d")}
0

So far so good...

From Zip to Map

We can use a traditional loop instead of the handy zip operator(s). Here it is with an implicit loop with map insted if the [3]-line above:

my %A2B = (^$A.chars).map({ ( $A.substr($_,1) => $B.substr($_, 1) ) });

The zip version is shorter, and easier to understand (in my opinion).

The last example had a duplicate value («d»), and the program detected it. But what happens if we change the order of the strings?

$ ./isomorphic-strings-zip -v add sum
: Hash: {:a("s"), :d("m")}
1

Oops.

The problem is that dupliate keys are lost in a hash.

We can in fact use a hash, but must build it up manually:

File: isomorphic-strings-zip-again
#! /usr/bin/env raku

unit sub MAIN (Str $A, Str $B, :v($verbose));

if $A.chars != $B.chars
{
  say ": Different length" if $verbose;
  say 0;
  exit;
}

my @A2B = ($A.comb Z $B.comb);

say ": Array: { @A2B.raku }" if $verbose;

my %A;

for @A2B -> @pair
{
  say ": Pair: @pair[0] -> @pair[1]" if $verbose;

  if %A{@pair[0]}.defined                               # [1]
  {
    if %A{@pair[0]} eq @pair[1]                         # [2]
    {
      say ": Duplicate of @pair[0] (value: @pair[1])";
      next;
    }
                                                        # [3]
    say ": Redeclaration of @pair[0] (values: %A{@pair[0]} and @pair[1])";
    say 0;
    exit;
  }
  %A{@pair[0]} = @pair[1];
}

say %A.keys.elems == %A.values.unique.elems             # [4]
  ?? 1
  !! 0;

[1] This one handles duplicate value in $A.

[2] Does if have the same value in $B, if so we are good.

[3] If not, we say so (say 0) and are done.

[4] This one handles duplicate values in $B.

Running it:

$ ./isomorphic-strings-zip-again -v abc xyz
: Array: [("a", "x"), ("b", "y"), ("c", "z")]
: Pair: a -> x
: Pair: b -> y
: Pair: c -> z
1

$ ./isomorphic-strings-zip-again -v abb xyy
: Array: [("a", "x"), ("b", "y"), ("b", "y")]
: Pair: a -> x
: Pair: b -> y
: Pair: b -> y
: Duplicate of b (value: y)
1

$ ./isomorphic-strings-zip-again -v sum add
: Array: [("s", "a"), ("u", "d"), ("m", "d")]
: Pair: s -> a
: Pair: u -> d
: Pair: m -> d
0

The duplicate detection is done by the last line it the program, and it does not have verbose output (as it probably should). I'll get back to that later..

It we swap the strings, we get the verbose explanation as to why it fails:

$ ./isomorphic-strings-zip-again -v add sum
: Array: [("a", "s"), ("d", "u"), ("d", "m")]
: Pair: a -> s
: Pair: d -> u
: Pair: d -> m
: Redeclaration of d (values: u and m)
0

We can get rid of the test for string length with a little care (and the use of roundrobin):

File: isomorphic-strings-roundrobin
#! /usr/bin/env raku

unit sub MAIN (Str $A, Str $B, :v($verbose));

my @A2B = (roundrobin($A.comb, $B.comb));  # [1]

say ": Array: { @A2B.raku }" if $verbose;

my %A;

for @A2B -> @pair
{
  unless @pair[1].defined        # [2]
  {
     say ": Different length" if $verbose;
     say 0;
     exit;
  }

  say ": Pair: @pair[0] -> @pair[1]" if $verbose;

  if %A{@pair[0]}.defined        # [3]
  {
    if %A{@pair[0]} eq @pair[1]  # [3a]
    {
      say ": Duplicate of @pair[0] (value: @pair[1])";
      next;
    }
                                 # [3b]
    say ": Redeclaration of @pair[0] (values: %A{@pair[0]} and @pair[1])";
    say 0;
    exit;
  }
  %A{@pair[0]} = @pair[1];
}

if %A.keys.elems == %A.values.unique.elems
{
  say 1;
}
else
{
  say ': Redeclaration of value (in $B).';  # [4]
  say 0;
}

[1] Using roundrobin instead of the infix Z operator gives trailing single elemens when the input lists have diffrent length (whereas Z (and zip) gives up when the shortest list has been exhausted. Coercing the sublists to hashes will fail if they contain single values, as we get if one of thestrings is shorter than the other one.

[2] This block handles the situation where the strimgs have different lenght. This approach has just as much code as the previous program, so the change probably isn't worth the effort.

[3] Do have duplicate keys? If they have the same value, that is ok [3a]. If not, say so and exit [3b].

[4] Duplicate values (from $B) that came from different keys (from $A).

See docs.raku.org/routine/roundrobin for more information about roundrobin.

Running it gives the same output as above, except when we have duplicate values in $B (where we now get verbose output):

$ ./isomorphic-strings-roundrobin -v sum add
: Array: [("s", "a"), ("u", "d"), ("m", "d")]
: Pair: s -> a
: Pair: u -> d
: Pair: m -> d
: Redeclaration of value (in $B).
0

A Perl Version

This is straight forward translation of «isomorphic-strings-zip2», except that I have used a loop on substrings instead of the «Z» (zip) operator. The last block has the verbose output from «isomorphic-strings-roundrobin»:

File: isomorphic-strings-perl
#! /usr/bin/env perl

use strict;
use feature 'say';
use List::Util 'uniq';
use Getopt::Long;

my $verbose = 0;

GetOptions("verbose"  => \$verbose);

my $A = shift(@ARGV) // die 'Please specify $A';
my $B = shift(@ARGV) // die 'Please specify $B';

if (length($A) != length($B))
{
  say ": Different length" if $verbose;
  say 0;
  exit;
}

my %A;

for my $index (0 .. length $A -1)
{
  my $pair0 = substr($A, $index, 1);
  my $pair1 = substr($B, $index, 1);
    
  say ": Pair: $pair0 -> $pair1" if $verbose;

  if (defined %A{$pair0})
  {
    if ($A{$pair0} eq $pair1)
    {
      say ": Duplicate of $pair0 (value: $pair1)" if $verbose;
      next;
    }
    
    say ": Redeclaration of $pair0 (values: " . $A{$pair0} . " and "
      . $pair1 . ")" if $verbose;
    say 0;
    exit;  
  }
      
  $A{$pair0} = $pair1;
}

if (scalar keys %A == scalar uniq values %A)
{
  say 1;
}
else
{
  say ': Redeclaration of value (in $B).' if $verbose;
  say 0;
}

Running it gives the same result as the final Raku version («isomorphic-strings-roundrobin»):

$ ./isomorphic-strings-perl abc xyz
1

$ ./isomorphic-strings-perl abb xyy
1

$ ./isomorphic-strings-perl sum add
0

$ ./isomorphic-strings-perl add sum 
0

Challenge #092.2: Insert Interval

You are given a set of sorted non-overlapping intervals and a new interval.

Write a script to merge the new interval to the given set of intervals.

Example 1:
Input $S = (1,4), (8,10); $N = (2,6)
Output: (1,6), (8,10)
Example 2:
Input $S = (1,2), (3,7), (8,10); $N = (5,8)
Output: (1,2), (3,10)
Example 3:
Input $S = (1,5), (7,9); $N = (10,11)
Output: (1,5), (7,9), (10,11)

I have chosen to set up a custom class for the intervals, so that I could use class methods for checking if two intervals are mergable - and then merge them.

File: insert-interval
#! /usr/bin/env raku

unit sub MAIN (:$S = "(1,4),(8,10)", :$N ="(2,6)", :v(:$verbose));  # [1]

my $s = $S.EVAL;                   # [1a]
my $n = $N.EVAL;                   # [1a]

class interval                     # [2]
{
  has $.start;                     # [2a]
  has $.stop;                      # [2b]

  method inside ($value)           # [3]
  {
    return $.start <= $value <= $.stop;
  }

  method mergable (interval $new)  # [4]
  {
    return True if $.start -1 < $new.start < $.stop +1;
    return True if $.start -1 < $new.stop  < $.stop +1;
    return False;	
  }
					     
  method merge (interval $new)     # [5]
  {
    return interval.new(start => min($.start, $new.start),
                        stop  => max($.stop,  $new.stop));
  }

  method Str                       # [6]
  {
    return "({ $.start },{ $.stop })";
  }
}

my @all = @$s.map({ interval.new(start => $_[0], stop => $_[1]) }); # [7]

my $m = interval.new(start => $n[0], stop => $n[1]);                # [8]

@all.push($m);                                                      # [9]

@all = @all.sort: { $^a.start <=> $^b.start ||  $^a.stop <=> $^b.stop };
                                            # [10] 
       
my @result;                                 # [11]

my $first = @all.shift;                     # [12]

loop                                        # [13]
{
  last unless @all.elems;                   # [14]

  my $second = @all.shift;                  # [15]
  if $first.mergable($second)               # [16]
  {
    my $new = $first.merge($second);        # [17]
    $first = $new;                          # [17a]
    next;                                   # [17b]
  }
  elsif $first.stop < $second.start + 1     # [18]
  {
    @result.push($first);                   # [18a]
    $first = $second;                       # [18b]
    next;                                   # [18c]
  }
  else                                      # [19]
  {
    @result.append($first, $second, @all);  # [19a]
    $first = Any;                           # [19b]
    last;                                   # [19c]
  }
}

@result.push($first) if $first;             # [20]

say @result.join(", ");                     # [21]

[1] Specifying a data structure on the command line is difficult, but a string (the quotes are there to prevent the shell from playing havoc with the parens) in combination with EVAL [1a] does the trick. (Note that EVAL does not run code when used like this, and trying to do so will cause an error.)

[2] The «interval» class, with the two class variables for the start and stop values.

[3] A method that tells us if a given value is inside the interval. (It is not used, but here it is anyway.)

[4] A method that tells us if two intervals can be merged (are overlapping).

[5] A method that merges two intervals. Do check if they are mergable first.

[6] Stringification of the objects, used by [21].

[7] Get a list of «interval» objects from the input.

[8] Ditto for the one interval to add,

[9] and add it to the list.

[10] Sort the list, with the lowest start value first. If several, get the one with the highest stop value first.

[11] We are going to store the result (as a sorted list of «interval» objects here.

[12] Get the first element.

[13] An eternal loop,

[14] • until we have emptied the array.

[15] Get the next element.

[16] Can we merge the two elements?

[17] • if so, merge them.

[18] Is the first one fully before the secons one? If so add it to the list [18a], and prepare for the nect iteration [18a,b,c].

[19] Otherwise (we are done, as they are not overlapping), add them both to the list [19a] and we are done.

[20] Add the current interval, if any.

[21] Stringify the objects (see [6]) and print them.

See docs.raku.org/routine/EVAL for more information about EVAL.

Running it:

$ ./insert-interval -S="(1,4), (8,10)" -N="(2,6)"
(1,6), (8,10)

$ ./insert-interval -S="(1,2), (3,7), (8,10)" -N="(5,8)"
(1,2), (3,10)

$ ./insert-interval -S="(1,5), (7,9)" -N="(10,11)"
(1,5), (7,9), (10,11)

We got the same result as the page linked to in the challenge.

It can be argued that adjacent integer intervals, as we have in the second and third examples, should be merged. The challenge does not, so I'll support it with a command line option «--integer» - where we simply change the offsets used in «mergeable»:

File: insert-interval2 (partial)
unit sub MAIN (:$S = "(1,4),(8,10)", :$N ="(2,6)", :i($integer), :v(:$verbose));
my $limit = $integer ?? 2 !! 1;
  method mergable (interval $new)
  {
    return True if $.start -$limit < $new.start < $.stop +$limit;
    return True if $.start -$limit < $new.stop  < $.stop +$limit;
    return False;	
  }

Running it:

$ ./insert-interval2 -S="(1,4), (8,10)" -N="(2,6)"
(1,6), (8,10)

$ ./insert-interval2 -S="(1,4), (8,10)" -N="(2,6)" -i
(1,6), (8,10)

$ ./insert-interval2 -S="(1,2), (3,7), (8,10)" -N="(5,8)"
(1,2), (3,10)

$ ./insert-interval2 -S="(1,2), (3,7), (8,10)" -N="(5,8)" -i
(1,10)

$ ./insert-interval2 -S="(1,5), (7,9)" -N="(10,11)" -i
(1,5), (7,11)

$ ./insert-interval2 -S="(1,5), (7,9)" -N="(10,11)"
(1,5), (7,9), (10,11)

Verbose mode is available as a command line option, but is missing from the program due to lack of time to implement it fully.

Perl

I did not have time for a perl version of this program.

And that's it.