Ben's Journal: January 2022

Friday, January 28, 2022

The $12 Kids Toy That's Also A Tech Game Changer

Looking back, my favorite Tech purchase from 2021 also happens to be my favorite Get Things Done purchase. It's the JONZOO LCD Writing Tablet.

I know what you're thinking: isn't that a $12.00 kids toy off Amazon? Yes, yes it is. And it's both awesome and a game changer.

The 'Jonzoo' is a knock off of the Boogie Board which I first noticed back in 2010. It's essentially an electronic version of the writing slate that was a staple in many classrooms back in the day.

The Jonzoo does one thing, and it does it well: it turns wherever you press the plastic stylus into a dimly lit color. It's perfect for keeping your 3 year old occupied without risking her going to town on your walls with crayon. It's also perfect for someone like myself who likes to problem solve through sketching and lists.

Here's a few reasons why I love this device:

It's sensitive: it's easy and natural to write on.
It's not too sensitive: resting my palm on the screen or otherwise touching it doesn't leave marks.
It's lightweight: it's a plastic toy, but it doesn't feel overly delicate or flimsy.
It's cheap: at $12.00 each I've now purchased three of them, one for each floor of our house. I always have one within reach.
It's reliable: there's a switch to avoid accidentally clearing your work. When engaged the device is write-only.
It's functional: misplace the plastic stylus that comes with the device? No worries, any pointy stick thing will work in its place. I've used everything from a toothpick to a clicky-pen with the tip retracted to scratch out notes.

As for drawbacks, it's hard to find any. Sure, the writing is a bit dim: but what do you want for a device that uses almost no power? And there's no ability to partially erase the screen, but having a write-only device adds integrity, right?

You might think the obvious drawback is that there's no save function. But that's just demonstrating a lack of imagination. I use my cell phone's camera to effortlessly capture the screen (see above) resulting in an archived copy of whatever I've written. This copy is safely stored in the cloud and available on all my devices.

Here's how it works in practice: I'll often sketch out my plan for the day, snap a pic, and throughout the day refer back to the plan by either checking my phone's gallery or visiting photos.google.com on my desktop.

During meetings, I'll take this basic practice up a notch. Once the board is filled, I'll quickly snap a picture, clear the screen and keep writing. After the meeting, I'll store all the photos in an album, which I can share with others or link to from Trello, Jira or some other task manager.

A nice bonus is that Google's object recognition tech is smart enough to classify these pics. A simple search for 'Blackboard' on Google Photos brings up the previously stored pictures for easy review.

So often the bold promises of cheap tech fall flat. But in this case, the device really delivers. I'm not giving up the notepad I carry in my back pocket any time soon, but the ability to have a zero-maintenance, markers-never-run-out whiteboard at my fingertips is just too cool. Sorry quad-ruled composition notebooks, it was a solid 9 year run, but you've been replaced.

Tuesday, January 25, 2022

Fighting Finicky Frames; Perfectly Printing Pics Programmatically

A few months ago, Shira picked up a couple of these photo-cubes with the plans to turn one into a gift:

Each side of the cube holds a 2.5x2.5 inch photo with only 2x2 inches visible. The challenge became: how can I prepare a series of pictures such that when they are printed, they will output as 2x2 inch photos on 4x6 inch paper?

Let's Do Some Math

Here was my approach: crop the pictures that Shira liked into squares. For the sake of an example, assume that each of the pictures is 1382x1382 pixels. Using a little algebra, I solved for the pixel width and height of the completed 4x6 photo:

# Derive the width
2 inches         6 inches
--------     =  ----------
1382 pixels      w pixels

2w = 6*1382
w = (6*1382) / 2
w = 4146


# Derive the height
2 inches         4 inches
--------     =  ----------
1382 pixels      h pixels

2h = 4*1382
h = (4*1382) / 2
h = 2764

Within Gimp, I resized the canvas of each of the photos to the calculated dimensions (4146x2764). We then sent off the pics to be printed and cut away the whitespace until the images fit the frame. The final product looked great.
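
The same proportion can be worked out on the command line. This is a minimal sketch, not part of the original workflow: canvas_px is a hypothetical helper name, and it assumes awk is available for the arithmetic.

```shell
#!/bin/bash
# canvas_px (hypothetical helper): given a square photo's size in pixels,
# its intended printed size in inches, and the paper size in inches,
# print the padded canvas size as WIDTHxHEIGHT in pixels.
canvas_px() {
  local photo_px=$1 photo_inches=$2 paper_w=$3 paper_h=$4
  awk -v px="$photo_px" -v pin="$photo_inches" -v w="$paper_w" -v h="$paper_h" \
    'BEGIN { printf "%.0fx%.0f\n", px * w / pin, px * h / pin }'
}

# A 1382 pixel photo shown at 2 inches, on 6x4 inch paper
canvas_px 1382 2 6 4   # prints 4146x2764
```

The output matches the hand-derived 4146x2764 above.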

Since then, I've been staring at the backup photo cube Shira bought and thinking: I really need to automate this process and try printing the photos again.

Along the way, while visiting the kids in Florida, G showed me a locket that she'd 'borrowed' from her Mom. It was pretty, but was missing the photo inside. I took pics of its dimensions and vowed I'd get her a photo that would fit. I realized, this was just another version of the photo-cube problem:

Let's Code This

I pondered a few different solutions to this image manipulation problem and arrived at a simple approach. My plan was to write a shell script that used ImageMagick to do the following:

Calculate the requested aspect ratio of both the final size (2x2 inches) and the provided photo (1382x1382), and confirm that they match.
Calculate the conversion factor: 1 inch = x pixels based on the pixel size of the input photo and the requested output size.
Use ImageMagick's convert tool's -border option to pad the image with extra space.
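
To make the three steps concrete, here's a rough sketch of the arithmetic on its own, with no ImageMagick involved. print_geometry is a hypothetical name, the 0.001 ratio tolerance is my assumption, and awk stands in for whatever calculator the real script uses.

```shell
#!/bin/bash
# print_geometry (hypothetical helper) mirrors the three steps above.
# Args: photo width px, photo height px, inner width in, inner height in,
#       outer width in, outer height in, border width in.
print_geometry() {
  awk -v pw="$1" -v ph="$2" -v iw="$3" -v ih="$4" -v ow="$5" -v oh="$6" -v b="$7" '
    BEGIN {
      # Step 1: the aspect ratios must agree (the tolerance is an assumption)
      d = pw / ph - iw / ih
      if (d < 0) d = -d
      if (d > 0.001) { print "ratio mismatch"; exit 1 }
      # Step 2: conversion factor, pixels per inch
      ppi = pw / iw
      # Step 3: the gray border and the white margins, in pixels
      border = b * ppi
      hm = (ow - iw - 2 * b) * ppi / 2
      vm = (oh - ih - 2 * b) * ppi / 2
      printf "ppi=%.2f border=%.2f h_margin=%.2f v_margin=%.2f\n", ppi, border, hm, vm
    }'
}

print_geometry 1382 1382 2 2 6 4 .25
# prints ppi=691.00 border=172.75 h_margin=1209.25 v_margin=518.25
```

As a sanity check, 1382 + 2*172.75 + 2*1209.25 = 4146 pixels, the same width derived by hand earlier.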

For example, here's the command I used to convert a directory full of square photos for output as 2x2 inch photos printed on 4x6 inch paper. The -b option added a .25 inch border around the photo, which makes it easier to trim. This .25 inch border means that each photo is now 2.5x2.5 inches, the frame's advertised size.

$ for f in inputs/*.jpg; do echo $f ; picassist -a prepare-print -W 6 -H 4 -w 2 -h 2 -b .25 -f $f -o outputs/$(basename $f) ; done
inputs/20190715_130759.jpg
inputs/20190716_073107.jpg
inputs/20210802_071223.jpg
inputs/20210803_102944.jpg
inputs/20210805_191843.jpg
inputs/f_img_4832.jpg

And here's the same command, but now I'm preparing the images for printing on 8.5x11 inch printer paper at home. Note that I'm still generating 2x2 inch photos with a .25 inch border.

for f in scaled/*.jpg; do echo $f ; picassist -a prepare-print -W 8.5 -H 11 -w 2 -h 2 -b .25 -f $f -o 8.5x11/$(basename $f) ; done
scaled/20190715_130759.jpg
scaled/20190716_073107.jpg
scaled/20210802_071223.jpg
scaled/20210803_102944.jpg
scaled/20210805_191843.jpg
scaled/f_img_4832.jpg

I used the same strategy for making prints for the locket. As far as I can tell, the dimensions of the locket are 0.6875x0.9375 inches. I added a 1/16" border around them:

 picassist -a prepare-print -W 4 -H 6 -w 0.6875 -h 0.9375 -b 0.0625 -f scaled.jpg -o output.jpg

Picture Perfect. Almost.

I sent a series of photos to CVS to be printed. When I picked them up a few hours later I was disappointed to see that the images were about 2/16" off. Something must be getting scaled unexpectedly in the process. I printed the photos on my laser printer, and found similar results. Both the gray border and picture are a couple of 16ths off.

Undeterred, I went to work trimming the photos and sliding them into the cube. Even with the dimensions being slightly off, the photos do look good:

I'm going to call this a success. I was able to prepare the photos, send them off to a printer and trim them with minimal effort. Wacky frame sizes, I no longer fear you!

Below is the script that does the image tweaking. Enjoy!

#!/bin/bash

##
## Do useful stuff with photos
##

usage() {
  me=$(basename $0)
  echo "Usage: $me -a prepare-print -W outer-width -H outer-height -w inner-width -h inner-height -b border-width -f img.jpg -o out.jpg"
  echo ""
  echo "Ex: $me -a prepare-print -W 6 -H 4 -w 1.5 -h 1.5 -b .2 -f img.jpg -o final.jpg"
  echo "  (print a 1.5x1.5 inch photo on a 4x6 in print)"
  exit 1
}

action=""

while getopts "a:W:H:w:h:b:f:o:" o; do
  case "$o" in
    a) action=$OPTARG ;;
    W) outer_width=$OPTARG ;;
    H) outer_height=$OPTARG ;;
    w) inner_width=$OPTARG ;;
    h) inner_height=$OPTARG ;;
    b) border_width=$OPTARG ;;
    f) file=$OPTARG ;;
    o) output=$OPTARG ;;
    *|h) usage ;;
  esac
done

calc() {
  echo "scale=3 ; $* " | bc -l
}

im_id() {
  identify "$@"
}

case "$action" in
  prepare-print)
    if [ -z "$outer_width" -o -z "$outer_height" -o -z "$inner_width" -o -z "$inner_height" -o -z "$border_width" ] ; then
      echo "Missing W, H, w, h or b."
      exit 1
    fi

    if [ -z "$file" ] ; then
      echo "Missing -f file"
      exit 2
    fi

    if [ ! -f "$file" ] ; then
      echo "File [$file] doesn't exist"
      exit 3
    fi

    if [ -z "$output" ] ; then
      echo "No output file set."
      exit 4
    fi
    
    # Quote the filename so paths containing spaces survive
    real_width=$(im_id -format '%w' "$file")
    real_height=$(im_id -format '%h' "$file")

    required_ratio=$(calc "$inner_width / $inner_height")
    real_ratio=$(calc "$real_width / $real_height")

    if [ "$required_ratio" != "$real_ratio"  ] ; then
      echo -n "Ratio mismatch: require ${inner_width}x${inner_height} (${required_ratio}) != "
      echo    "${real_width}x${real_height} (${real_ratio})"
      # Exit non-zero so callers can detect the mismatch
      exit 5
    fi

    px_per_real=$(calc "$real_width / $inner_width")

    border_px=$(calc "$border_width * $px_per_real")

    h_margin=$(calc "(($outer_width - $inner_width - ( $border_width * 2)) * $px_per_real) / 2")
    v_margin=$(calc "(($outer_height - $inner_height - ( $border_width * 2)) * $px_per_real) / 2")

    convert  "$file" \
             -bordercolor gray -border "$border_px" \
             -bordercolor white -border "${h_margin}x${v_margin}" \
             "$output"
    ;;

  *) usage ;; 
esac

Friday, January 21, 2022

Hidden in Plain Sight: The Butt Millet Memorial Fountain

A few weeks back, Shira and I were walking through DC and found ourselves perusing the National Christmas Tree display. Nearby, I saw the familiar Zero Milestone. I also noticed a tired looking fountain-statue-thing that I'd never really taken note of before.

A bit of research confirmed that I'd found the Butt Millet Memorial Fountain. While it may be obscure, it's worth getting to know.

Here's 4 fascinating facts about this easy-to-miss memorial.

1. It's a memorial that honors an early DC Bromance. The memorial honors two men, Archibald Butt and Francis Millet. Butt was a well known soldier; Millet an accomplished artist. While Millet was married, his wife lived out of town and Butt and Millet lived together, having a reputation for throwing lavish parties. Were they more than best buds? Even the National Park Service takes the time to note that while Millet was married he had several same-sex relationships in his lifetime.

2. Butt was a close friend to both Presidents Roosevelt and Taft. Notably, Taft considered him like a brother. Taft was so moved at his funeral that he broke down during Butt's eulogy and had to be led from the podium in tears.

3. Butt and Millet died together during the sinking of the Titanic, and the memorial makes a nod to this. These weren't merely two anonymous passengers who happened to be lost during a well known tragedy; their loss was front page news.

A day after the sinking of the Titanic, it was wrongly reported that Millet and Butt were both safe. Within 10 days of the sinking, there were already legends being published that Butt courageously helped with the rescue. While these legends have never been confirmed, it didn't stop the creator of the memorial fountain from incorporating them:

The central shaft [of the fountain] will reach a height of twelve feet. It is of classic design. Upon one face it will bear an armed female figure, in bas-relief, representing Chivalry, having reference to Maj. Butt's aid to women and children on the occasion of the disaster in which he met his death; on the opposite face will be a similar figure representing Art, having special reference to Mr. Millet.

4. The memorial may (but probably didn't) have had a functional purpose. When I read this description of the Butt-Millet Fountain, I thought I'd connected some important dots:

On the south, a man with a helmet, sword, and shield represents military valor in honor of Butt. The fountain was designed to be used as drinking water for horses of the U.S. Park Police, but don’t imagine it is still used this way.

The idea that the memorial doubled as an equine drinking fountain is both clever and may explain its shape. And yet, I can't find any evidence to suggest this claim is true. Sure, the bowl of the fountain looks to be at an ideal height for a horse to take a drink; and it may have served that purpose on occasion. But, the description of the fountain when it was built neglects to mention this use. As late as 1934 there are descriptions of Washington, DC having purpose built horse drinking fountains. If such fountains existed, why make that part of a memorial?

In retrospect, the idea that you'd combine a memorial for two of DC's well known citizens who were tragically lost in an epic disaster with a place for horses to catch a drink seems pretty ridiculous.

Still, given the shape of the fountain, I can see how an urban legend like this would thrive.

Wednesday, January 19, 2022

DC's National WW I Memorial, Incomplete and Worth A Visit

A few weeks ago Shira and I were traipsing through DC when we stumbled upon the National World War I memorial. This was surprising as I had no idea DC had such a memorial. Apparently, until recently, we didn't. The location is a sort of expansion of Pershing Park, which has been in place since the 1980's and honors General Pershing and the 2 million(!) members of the American Expeditionary Force he led during World War I.

Eventually, I'd find myself standing in front of a plaque that would explain the in-progress nature of the memorial. But before I did, I found myself face to face with A Soldier's Journey, a massive mural that attempts to capture the soldier's experience during World War I.

I was taken aback by the choice of using a rough sketch to capture the scene. But at the same time, it worked. As I stared at the mural in front of me I struggled to square the simplicity of materials and techniques with the awesome emotive power that it brought to bear.

As I looked closer, I realized that this wasn't just a rough sketch. It was a rough sketch done on a tarp like material that was then bolted into place.

What?

A plaque at an elevated position in the memorial cleared up the confusion. A Soldier's Journey is going to be a bronze statue; the rough sketch is a temporary placeholder.

I'm sure the completed bronze relief will be impressive. But I'd recommend stopping by while the sketch is in place. I'm telling you, there's something magical about watching something so simple say so much.

Tuesday, January 18, 2022

Making a Good Thing Better: A Forth Unit Testing Framework

Regardless of the programming language you're coding in, unit testing is an obvious best practice. However, given Forth's anything goes philosophy, testing goes from nice to have to the only sane way to code without losing your mind.

Testing Without a Framework

The interactive nature of Forth, the ease of word definition and the handy assert word in Gforth provide most of what I needed for testing my code.

Consider these tests for coin.fs, a module that simulates coin flips.

: test-coin-basics ( -- )
    assert( heads heads? )
    assert( tails tails? )
    assert( heads tails? false  = )
    assert( heads coin? )
    assert( tails coin? )
    assert( 100 coin? false = )
    assert( flip coin? )
;

0 value #heads      0 value #tails
: close-enough? ( x y -- b )
    - abs 30 < ;

: test-coin-flip-test ( -- )
    randomize
    0 to #heads
    0 to #tails
    assert( #heads 0 = )
    assert( #tails 0 = )

    100 0 +do
        flip
        heads? if 1 0 else 0 1 endif
        #tails + to #tails
        #heads + to #heads
    loop

    assert( #heads  #tails close-enough? )
;

\ Run the tests!
test-coin-basics test-coin-flip-test

By executing require on this test file, the tests are not only defined but executed. If there's a failure in the test, the system will alert you via a failed assertion. Otherwise, the tests will quietly succeed.

This approach works well, but there are a few minor details that nag at me. I don't love that I have to duplicate the names of the tests: first to create them, and then to execute them. And I don't love that I'm defining test words in the global scope which may be unnecessary for the rest of the application, or may conflict with other tests defined later.

The latter problem I could solve by defining the tests as private words in a module. But the issue of having to name tests, and then repeat that name below still stands.

What I wanted was to borrow a page out of my lightweight PHP unit test framework. A central idea there is that tests are not named; they're simply functions added to a list that can be executed later by a test runner.

Somewhat surprisingly, Gforth makes it almost trivial to implement this model.

Testing With a Framework

Here's how the above tests look when they're taking advantage of my lightweight testing framework:

\ test out coins.fs

:test
    assert( heads heads? )
    assert( tails tails? )
    assert( heads tails? false  = )
    assert( heads coin? )
    assert( tails coin? )
    assert( 100 coin? false = )
    assert( flip coin? )
;

0 value #heads     0 value #tails
: close-enough? ( x y -- b )
    - abs 30 < ;

:test
    randomize
    0 to #heads
    0 to #tails
    assert( #heads 0 = )
    assert( #tails 0 = )

    100 0 +do
        flip
        heads? if 1 0 else 0 1 endif
        #tails + to #tails
        #heads + to #heads
    loop

    assert( #heads  #tails close-enough? )
;

The defining word :test* creates a new, anonymous test which will be executed when run-tests is invoked. Not only did I avoid having to duplicate the name of the test to execute, I didn't have to name the test in the first place. Whoo!

shuffler.fs uses the functionality in coins.fs, so it requires its tests as well.

\ Import libraries
require lib/modules.fs
require lib/utils.fs
require lib/arrays.fs
require lib/strings.fs
require lib/random.fs
require lib/testing.fs
require lib/coins.fs   \ <<< The Code
require lib/cards.fs
require lib/decks.fs
require lib/shuffle.fs  

\ Import tests
require tests/utils.fs 
require tests/modules.fs
require tests/cards.fs
require tests/coins.fs \ <<< The Tests
require tests/decks.fs
require tests/shuffle.fs 
require tests/random.fs

\ Run the tests
cr run-tests cr cr

\ Top level code that uses the libraries
new-deck shuffle .deck cr
new-deck 7*shuffle .deck cr cr

When I execute shuffler.fs I see the following messages:

Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
s" /home/ben/dt/i2x/code/src/master/forth/shuffler.fs" included redefined .card  
0 Failures, 13 Tests
...

The first four lines of output are typical for Gforth. The 0 Failures, 13 Tests message indicates that my tests all ran and there were no failures.

This approach bakes running the tests into every execution of my Forth project, and does so in a streamlined way.

If I want to be verbose, it's possible to ask the test framework for the status of every test that was run:

tests. 0  OK /home/ben/dt/i2x/code/src/master/forth/tests/utils.fs:5 
1  OK /home/ben/dt/i2x/code/src/master/forth/tests/utils.fs:12 
2  OK /home/ben/dt/i2x/code/src/master/forth/tests/utils.fs:22 
3  OK /home/ben/dt/i2x/code/src/master/forth/tests/utils.fs:33 
4  OK /home/ben/dt/i2x/code/src/master/forth/tests/modules.fs:20 
5  OK /home/ben/dt/i2x/code/src/master/forth/tests/modules.fs:28 
6  OK /home/ben/dt/i2x/code/src/master/forth/tests/cards.fs:3 
7  OK /home/ben/dt/i2x/code/src/master/forth/tests/coins.fs:3 
8  OK /home/ben/dt/i2x/code/src/master/forth/tests/coins.fs:20 
9  OK /home/ben/dt/i2x/code/src/master/forth/tests/decks.fs:3 
10  OK /home/ben/dt/i2x/code/src/master/forth/tests/decks.fs:13 
11  OK /home/ben/dt/i2x/code/src/master/forth/tests/shuffle.fs:6 
12  OK /home/ben/dt/i2x/code/src/master/forth/tests/random.fs:3 
 ok

The number in the first column can be used to re-execute a test, should I want to do so interactively:

8 run-test
8 run-test OK ok

These extra capabilities are a nice bonus, but ultimately, it's that one terse line telling me that all the tests passed that really saves the day. As soon as I see a reported failure, I stop what I'm doing and focus on fixing that issue.

A Surprisingly Simple Implementation

Building this unit test framework mirrored the development of my Forth module system. I struggled with a number of false starts, and when I finally figured out an ideal approach, the code came together effortlessly.

The :test defining word is a core part of the system, and it's delightfully short:

: :test ( -- )
  noname : latestxt register-test ;

:test makes use of Gforth's noname word, which has the following characteristics:

The next defined word will be anonymous. The defining word will leave the input stream alone. The xt of the defined word will be given by latestxt.

The anonymous function's execution token, created by noname, is handed to register-test which stores it away for later use.

run-tests loops through the registered execution tokens and calls catch on them. catch leaves a value on the stack as to whether the word executed without raising an exception. This value is stored in outcomes, which can be inspected later.

: run-tests ( -- )
    #tests 0 +do
        i tests @ catch i outcomes !
    loop .summary ;

.summary prints out a summary of the return codes in the outcomes array.
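
For readers who don't speak Forth, here is the same register-then-run pattern sketched in Bash. It's an analogy, not a translation of the framework: every name is hypothetical, a subshell loosely plays the role of catch, and because Bash has no anonymous functions the tests still need names.

```shell
#!/bin/bash
# A Bash analogy of the registry plus runner; all names are hypothetical.
tests=()      # registered test function names, in registration order
outcomes=()   # exit status of each test, indexed like tests

register_test() { tests+=("$1"); }

run_tests() {
  local i failures=0
  outcomes=()
  for i in "${!tests[@]}"; do
    # Run each test in a subshell so a failure can't abort the runner
    if ( "${tests[$i]}" ) >/dev/null 2>&1; then
      outcomes[$i]=0
    else
      outcomes[$i]=$?
      failures=$((failures + 1))
    fi
  done
  # Summary line in the same shape as the framework's output
  echo "$failures Failures, ${#tests[@]} Tests"
}

test_math()   { [ $((2 + 2)) -eq 4 ]; }
test_string() { [ "heads" != "tails" ]; }
register_test test_math
register_test test_string

run_tests   # prints 0 Failures, 2 Tests
```

The outcomes array is kept around after the run, which is what would make a per-test report possible later.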

You can find the unit testing framework's source code here. I love that I've simplified test definition, and that test execution becomes a seamless part of app development. It's also a nice bonus that I've simplified the process of coding in Forth without losing my mind.

*I'm still not sure if the best convention is: :test or test:. Oh well; naming.

Wednesday, January 12, 2022

Free and Fast, A Programmer Friendly Source for Historic News Data

To power past projects, I've looked around for sources of historic news data that I could query with ease. The APIs I found required subscription fees that I couldn't justify.

The other day, I realized a free and accessible source for structured news data may be readily at hand. After a few minutes of poking around, I had my first version of headlines, a shell script that pulls back headlines given a date.

Let's Mine The News

Here's the script in action. I'm showing 5 headlines from 3 random days within the last 3,000 days. Full disclosure: I ran this process a few times until I got 3 dates that were relatively far apart.

$ for i in 1 2 3 ; do x=$(($RANDOM % 3000)) ; echo "$x days ago" ; headlines -d "$x days ago"  | head -5  ; done
350 days ago
Tue, 26 Jan 2021 22:40:08 GMT|President Biden announces the purchase of enough doses to fully vaccinate Americans by summer's end
Tue, 26 Jan 2021 22:22:41 GMT|Watch Biden's vaccine announcement
Tue, 26 Jan 2021 20:12:12 GMT|White people are getting vaccinated at higher rates
Sat, 23 Jan 2021 01:35:21 GMT|See expert's plan to end pandemic in four weeks
Tue, 26 Jan 2021 12:34:44 GMT|The global scramble for vaccines is getting ugly
2753 days ago
Sun, 29 Jun 2014 19:52:16 EDT|Gay couple's 40-year immigration battle
Fri, 27 Jun 2014 06:04:45 EDT|'Heavy drinker' definition surprises
Sun, 29 Jun 2014 07:14:00 EDT|NASA tests saucer for Mars mission
Sat, 28 Jun 2014 20:18:19 EDT|Routine traffic stop turns physical
Sun, 29 Jun 2014 14:15:01 EDT|90 rolls of duct tape made THIS
1461 days ago
Thu, 11 Jan 2018 22:42:45 GMT|President reportedly suggests US get more people from countries like Norway
Thu, 11 Jan 2018 22:51:43 GMT|Democrats say Trump's remark proves he is racist
Wed, 10 Jan 2018 19:50:37 GMT|White House corrects DACA meeting transcript
Thu, 11 Jan 2018 22:46:06 GMT|Trump rejects bipartisan DACA proposal
Thu, 11 Jan 2018 21:27:49 GMT|Rep. Cuellar: The border wall is a dumb idea

headlines is powered by the Wayback Machine at archive.org. It works because archive.org stores RSS feeds for posterity. The data you're seeing above is from CNN's RSS feed, which archive.org has been diligently capturing nearly every day since January 10th, 2005.

Here's an example of pulling from three different RSS feeds: CNN, New York Times and Fox News. I'm using '1460 days ago,' which was inspired by the random date selection above. Apparently on this day, it was being reported that Trump had casually denigrated Haiti and pretty much all of Africa.

At first it appears that CNN and The New York Times are lit up with the news, while Fox's top story is "Surprising celebrity facts." However, if you look at the dates, you'll see that archive.org didn't have a snapshot of the Fox feed for January 12th, 2018, so it's giving us January 10th. The news about Trump's comments came on the 11th, so the fact that Fox isn't covering it yet isn't as meaningful as it may appear. There's also no proof that the RSS feed captured by archive.org represents what people saw on the home page of foxnews.com.

$ for src in cnn_top nyt_top fox_top ; do echo "Source=$src" ; headlines -d "1460 days ago" -s $src | head -5 ; done
Source=cnn_top
Fri, 12 Jan 2018 12:49:56 GMT|Two other GOP senators say they 'don't recall' the President 'saying these comments specifically'
Fri, 12 Jan 2018 19:27:17 GMT|What Trump supporters think of his 'shithole' remark
Fri, 12 Jan 2018 17:22:57 GMT|Analysis: Why no one should believe Trump's 'shithole' denial
Fri, 12 Jan 2018 14:39:27 GMT|Anchor chokes up discussing Trump comment
Fri, 12 Jan 2018 08:24:20 GMT|Late night reacts to Trump's 'shithole' comments
Source=nyt_top
Fri, 12 Jan 2018 23:05:52 GMT|Trump, Haiti, London: Your Friday Evening Briefing
Fri, 12 Jan 2018 19:03:01 GMT|Senator Insists Trump Used ‘Vile and Racist’ Language
Fri, 12 Jan 2018 20:48:11 GMT|News Analysis: A President Who Fans, Rather Than Douses, the Nation’s Racial Fires
Fri, 12 Jan 2018 23:41:18 GMT|‘‘Don’t Feed the Troll’: Much of the World Reacts in Anger at Trump’s Insult
Fri, 12 Jan 2018 22:47:38 GMT|Porn Star Who Claimed Sexual Encounter With Trump Received Hush Money, Wall Street Journal Reports
Source=fox_top
Wed, 10 Jan 2018 10:00:00 GMT|Who knew? Surprising celebrity facts
Wed, 10 Jan 2018 10:00:00 GMT|FOX411's snap of the day
Wed, 10 Jan 2018 03:36:15 GMT|Magnitude 7.6 quake hits in Caribbean north of Honduras
Wed, 10 Jan 2018 03:29:06 GMT|Church: Guam archbishop faces new sexual assault allegation
Wed, 10 Jan 2018 03:22:58 GMT|Australia experiences 3rd hottest year on record in 2017

The above example highlights the limitation of depending on archive.org. There's no guarantee that there will be headline data for every day of the year. Still, it's remarkable how effective headlines is given its simplicity.
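
One way to guard against this is to compare the requested date with the capture date embedded in the snapshot URL. Wayback Machine snapshot URLs carry a 14-digit timestamp after /web/ (20180110041238 in the Fox example). The sketch below leans on that; snapshot_date and same_day are hypothetical helpers and are not part of headlines.

```shell
#!/bin/bash
# snapshot_date (hypothetical helper): print the YYYYMMDD capture date from
# a URL shaped like http://web.archive.org/web/<14 digits>/<original url>.
# Prints nothing and returns 1 when the URL doesn't have that shape.
snapshot_date() {
  local url=$1
  local re='/web/([0-9]{8})[0-9]{6}/'
  if [[ $url =~ $re ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# same_day (hypothetical helper): succeed only when the snapshot was
# captured on the requested YYYYMMDD day.
same_day() {
  local requested=$1 url=$2
  [ "$(snapshot_date "$url")" = "$requested" ]
}

url='http://web.archive.org/web/20180110041238/http://feeds.foxnews.com/foxnews/latest'
snapshot_date "$url"   # prints 20180110
same_day 20180113 "$url" || echo "snapshot is from a different day"
```

A check like this would have flagged the Fox results above as coming from an earlier day than the one requested.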

How Does It Work?

Pulling news data from archive.org is a two step process. First, the script queries archive.org for the status of the RSS feed in question. For example:

$  curl -s -G \
    --data-urlencode url=http://feeds.foxnews.com/foxnews/latest \
    --data-urlencode timestamp=20180113 https://archive.org/wayback/available | jq .
{
  "url": "http://feeds.foxnews.com/foxnews/latest",
  "archived_snapshots": {
    "closest": {
      "status": "200",
      "available": true,
      "url": "http://web.archive.org/web/20180110041238/http://feeds.foxnews.com/foxnews/latest",
      "timestamp": "20180110041238"
    }
  },
  "timestamp": "20180113"
}

Then, the script takes the 'closest' URL, retrieves that RSS feed and processes it with xmlstarlet to make human readable output.

$ curl -s 'http://web.archive.org/web/20180110041238/http://feeds.foxnews.com/foxnews/latest' | xmllint --format - | head -10
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?>
<?xml-stylesheet type="text/css" media="screen" href="http://feeds.foxnews.com/~d/styles/itemcontent.css"?>
<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>FOX News</title>
    <link>http://www.foxnews.com/</link>
    <description><![CDATA[FOXNews.com - Breaking news and video. Latest Current News: U.S., World, Entertainment, Health, Business, Technology, Politics, Sports.]]></description>
    <image>
      <url>http://tools.foxnews.com/sites/tools.foxnews.com/files/images/fox-news-logo.png</url>

Ultimately, this works as well as it does because archive.org is indexing, and making available to us, a machine readable format. While news organizations never intended to maintain historic snapshots of their feeds, archive.org is glad to do precisely this. It also raises the question: what other programmer friendly data is archive.org storing?

The Complete Script

Here's the most recent version of headlines, which includes support for pulling from a variety of RSS feeds. Happy News Hacking!

#!/bin/bash

##
## Show headlines
##

usage() {
  me=$(basename $0)
  echo "Usage: $me  -t timestamp [-v] [-s source]"
  echo "Usage: $me  -d date-string [-v] [-s source]"
  exit 1
}

source_map() {
  case $1 in
    cnn_top) u='http://rss.cnn.com/rss/cnn_topstories.rss' ;;
    cnn_world) u='http://rss.cnn.com/rss/cnn_world.rss' ;;
    cnn_politics) u='http://rss.cnn.com/rss/cnn_allpolitics.rss' ;;
    cnn_tech) u='http://rss.cnn.com/rss/cnn_tech.rss' ;;
    cnn_business) u='http://rss.cnn.com/rss/money_latest.rss' ;; 
    nyt_business) u='https://rss.nytimes.com/services/xml/rss/nyt/Business.xml' ;;
    nyt_politics) u='https://rss.nytimes.com/services/xml/rss/nyt/Politics.xml' ;;
    nyt_top) u='https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml' ;;
    fox_top) u='http://feeds.foxnews.com/foxnews/latest' ;;
    fox_politics) u='http://feeds.foxnews.com/foxnews/politics' ;;
    fox_tech) u='http://feeds.foxnews.com/foxnews/scitech' ;;
  esac

  echo $u;
}


source=cnn_top

while getopts "t:d:vus:h" o; do
  case "$o" in
    s) source=$OPTARG ;;
    d) date=$OPTARG ;;
    t) timestamp=$OPTARG  ;;
    u) include_url=yes ;;
    v) verbose=yes ;;
    * | h)
      usage
      ;;
  esac
done


if [ -n "$date" ] ; then
  timestamp=$(date -d "$date" +%Y%m%d)
fi

source_url=$(source_map "$source")
if [ -z "$source_url" ] ; then
  echo "$source isn't a valid source"
  exit 1
fi

if [ -z "$timestamp" ] ; then
  usage
fi

timestamp=$(echo $timestamp | sed 's/[^0-9]//g')

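url=$(curl -s -G \
           --data-urlencode "url=$source_url" \
           --data-urlencode "timestamp=$timestamp" \
           'https://archive.org/wayback/available' | tee "$HOME/.headlines.wb" | jq -r '.archived_snapshots.closest.url // empty')

# With '// empty', jq prints nothing (rather than the string "null")
# when no snapshot is available, so the check below actually fires.
if [ -z "$url" ] ; then
  echo "No headlines found for: $timestamp"
  echo "";
  cat "$HOME/.headlines.wb"
  exit 1
fi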



curl -s "$url" | xmllint --format - | if [ "$verbose" = "yes" ] ; then
  cat
else
  expr="-m '/rss/channel/item' -v pubDate -o '|' "
  if [ "$include_url" = "yes" ] ; then
    expr="$expr -v guid -o '|' "
  fi
  expr="$expr -v title -n"
  eval xmlstarlet sel -t $expr | grep -v '^[|]'
fi

Friday, January 07, 2022

SaSaS #1

We start this first edition of the South Arlington Sticker Art Scene with some bad news. Arguably the soul of South Arlington's sticker art scene was the utility pole outside of Bob & Edith's Diner. Over time, this post accumulated a delightful collection of stickers. For those paying attention, passing by was like encountering a public art exhibition.

Alas, some thoughtless soul took a bottle of black spray paint and turned this cooperative art scene into an eyesore. Why? I can't imagine.

In happier news, I was psyched to spot a couple of 'Art Is Dead' stickers while running last week. It's quite possible these were ordered off the web and randomly slapped up to mar public property. However, I'd rather believe this is the work of the same artist who made a statement by covering a bridge in Sherlock stickers. I'd seen the 'Art Is Dead' sticker with the text lined up vertically before, but this was the first time I'd seen the letters lined up diagonally. I'm interpreting this as the artist continuing to tune his or her work, which I suppose is the complete opposite of the 'Art Is Dead' message of the sticker. Well played, anonymous artist, well played.

Finally, here's a new favorite sticker I came across a couple of weeks ago. This one isn't located in South Arlington, it's across the bridge in DC.

I know what you're thinking, that's a QR code--big deal. Except, it's not. It looks like a QR code, with the checkered pattern and distinctive black-outlined rectangles in the corner, but if you try to scan the sticker with your phone, no QR code is detected.

Looking at this page which describes how QR codes work, it's still not clear to me what the above image is lacking to be detected as a valid QR code. I can see the Finder Patterns and Timing Patterns clearly enough. But still, it doesn't work.

Granted, the most likely explanation for this 'sticker' is that the poster intended it as a working QR code. The fact that it doesn't scan is almost certainly accidental. It's also possible that some kid intentionally posted this QR-looking code simply to mess with tourists.

However, like the 'Art Is Dead' hypothesis above, I'd like to imagine that this sticker slap is an artist's statement. What looks like urban visual detritus is actually something more. It's not an ad for a company or band, but a hidden-in-plain-sight piece of art that you expect to behave one way, but behaves another.

As far as I'm concerned, this humble "QR Code" could easily live in the Hirshhorn Museum, among other clever and subversive pieces of art.

Wednesday, January 05, 2022

Surprisingly Elegant: Implementing Modules in Forth

Recently, I've been experimenting with problem solving in Forth. The experience has been a mix of embracing the familiar (keeping definitions small, focusing on clean abstractions, not underestimating the power of refactoring) as well as stepping outside my comfort zone (learning to think in postfix, embracing loops over recursion, wrapping my head around a global parameter stack). One side effect of Forth's simplicity that has nagged at me is its Everything Is Global philosophy.

What I kept wanting was a simple module system that would let me classify words as being either public or private to a given file.

The fact that Forth doesn't come with a module system is far more feature than bug. Forth, like Scheme, is a language that strives to meet two seemingly contradictory goals: (1) the core language should be as compact as possible, and (2) programmers should be able to build sophisticated abstractions with ease. Building a module system, in addition to supporting the problems I was solving, would be an elegant test of whether Forth meets both goals.

After many false starts, I finally arrived at a solution. Two quick warnings:

I'm a Forth newbie, so this code is almost certainly problematic.
This code runs on gforth and may not run on other Forth implementations.

With those warnings out of the way, let's check out what I built.

Modules In Action

Here's a verbose and somewhat contrived example of a module:

\ Forth module for working with different units of temperature

module

:private scale-up ( x -- x-scaled )
    100 * ;

:private scale-down ( x-scaled -- x )
    50 + 100 / ;

private-words

create unit-symbols
char C c,
char F c,
char K c,

: the-sym ( index -- char )
    unit-symbols + c@ ;

: C ( -- char ) 0 the-sym ;
: F ( -- char ) 1 the-sym ;
: K ( -- char ) 2 the-sym ;

: deg. ( value unit-c -- )
    swap . ." deg " emit ;

public-words

: deg-c ( c -- t )
    scale-up ;

: deg-f ( f -- t )
    scale-up 3200 - 5 9 */ ;

: deg-k ( k -- t )
    scale-up 27315 - ;

: as-deg-f ( t -- f )
    9 5 */ 3200 + scale-down ;

: as-deg-c ( t -- c )
    scale-down ;

: as-deg-k ( t -- k )
    27315 + scale-down ;

: deg-c. ( t -- )
    as-deg-c C deg. ;

: deg-f. ( t -- )
    as-deg-f F deg.  ;

: deg-k. ( t -- )
    as-deg-k K deg. ;

publish

A module starts with the word module and is finalized by the word publish. In between, the programmer can use the words public-words and private-words to delineate sections of code that are public and private.

As the above example shows, it's possible to toggle back and forth between public and private words.

One fun bit of syntactic sugar is the defining word :private. This creates a colon definition but does so in the private word space.

And here's an example of how the module is used:

\ temperature example, useful for demonstrating modules & tests

require lib/modules.fs
require lib/utils.fs
require lib/testing.fs
require lib/temps.fs

require tests/utils.fs
require tests/modules.fs
require tests/temps.fs

run-all

: tab ( -- )
    5 0 u+do space loop ;

10 constant chart-incr

: f-chart. ( low high -- )
    chart-incr + swap u+do
        cr i deg-f
        dup deg-f. tab
        dup deg-c. tab
        deg-k.
    chart-incr +loop cr ;

Other than a few require's, modules are invisible. deg-f. is public, so you can execute it; the word F is private, so it's not visible.

For completeness, here's the output of the above code:

Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
s" /home/ben/dt/i2x/code/src/master/forth/temps.fs" included 9 Tests Run, 9 Passed, 0 Failed ok
0 100 f-chart.
0 100 f-chart. 
0 deg F     -18 deg C     255 deg K
10 deg F     -12 deg C     261 deg K
20 deg F     -7 deg C     266 deg K
30 deg F     -1 deg C     272 deg K
40 deg F     4 deg C     278 deg K
50 deg F     10 deg C     283 deg K
60 deg F     16 deg C     289 deg K
70 deg F     21 deg C     294 deg K
80 deg F     27 deg C     300 deg K
90 deg F     32 deg C     305 deg K
100 deg F     38 deg C     311 deg K
 ok

Implementing Modules

I'm amazed at how little code I needed to implement my module system. You can find the complete source code here. Here's how the code breaks down:

Creating a module adds two new wordlists to the search order (Forth's stack of wordlists): one for public words and one for private words. Notably, the private wordlist is on top of the stack.

: module ( -- )
    wordlist >order ( public )
    wordlist >order ( private )
    public-words ;

The words public-words and private-words use set-current to set the wordlist that newly compiled words are appended to.

: public-words ( -- )
    get-order >r
    over set-current
    r> set-order ;
    
: private-words ( -- )
    get-order >r
    dup set-current
    r> set-order ;

And finally, publish invokes previous which drops the top wordlist, that is, the private wordlist.
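
The post describes publish but doesn't list it, so here is a minimal sketch of what that description implies. This is an assumption based on the sentence above, not the code from the linked source; the real definition may do more, such as restoring the compilation wordlist.

```forth
\ Sketch only, reconstructed from the description above.
: publish ( -- )
    previous ;  \ pop the top of the search order, i.e. the private wordlist
```

Note that with this sketch the public wordlist stays in the search order and remains the compilation wordlist, so definitions that follow the module land alongside its public words.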

That's it; that's the entire system. The code works because dropping the private wordlist removes the ability to look up its words by name, while references already compiled into other words point at the words themselves rather than at their names, and so are left untouched. I do believe I've added the functionality I was after while remaining in the spirit of Forth.
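
The name-versus-reference behavior can be seen without any of the module words. The following is a small sketch for gforth using plain search-order words; hidden-wl, home-wl, secret and show-secret are made-up names for illustration and are not part of the module code.

```forth
\ Illustrative names only; none of these come from the module library.
wordlist constant hidden-wl       \ a fresh, empty wordlist
get-current constant home-wl      \ remember where definitions normally go

hidden-wl >order                  \ make hidden-wl searchable...
hidden-wl set-current             \ ...and compile new words into it
: secret ( -- n ) 42 ;

home-wl set-current               \ back to the usual compilation wordlist
: show-secret ( -- n ) secret ;   \ secret is looked up now, at compile time

previous                          \ drop hidden-wl from the search order
show-secret .                     \ prints 42: the compiled reference survives
s" secret" find-name 0= .         \ prints -1 (true): the name can't be found
```

Typing secret at the prompt after the previous would give an undefined word error, which is exactly the privacy the module system relies on.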

The syntactic sugar :private is perhaps some of the most elegant code I've ever written in any language. Check it out:

: :private private-words : public-words ;

The definition of :private simply switches to the private wordlist, defines the word using :, and switches back to the public wordlist.
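
It may look odd that public-words runs before the body of the new word has even been compiled. On gforth, at least, the destination wordlist appears to be fixed at the moment : creates the header, so switching back right away is harmless. Here is a self-contained sketch that pulls the pieces together; publish is my assumed one-line definition from the description above, and times-two and times-four are made-up example words.

```forth
\ The module words from this post, ordered so each is defined before use.
: public-words ( -- )
    get-order >r over set-current r> set-order ;
: private-words ( -- )
    get-order >r dup set-current r> set-order ;
: module ( -- )
    wordlist >order wordlist >order public-words ;
: publish ( -- ) previous ;       \ assumption: not listed in the post
: :private private-words : public-words ;

module
:private times-two ( n -- 2n ) 2 * ;            \ lands in the private wordlist
: times-four ( n -- 4n ) times-two times-two ;  \ public, uses the private word
publish

3 times-four .                    \ prints 12
s" times-two" find-name 0= .      \ prints -1 (true): times-two stayed private
```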

Lessons Learned

I'm quite pleased with this little detour to craft a module system. I not only built a useful abstraction that simplifies future problem solving, but I did so in a way that has, to me, highlighted a number of Forth's strengths.