Building a Twitter Bot
Lesson 19: Cleaning up our previous work
So today I want to do something fun. When developing software, it's nice
every once in a while to work on something that isn't a regular old business
problem. Recently, I learned there was a bot that only tweeted portions of
"Africa" by Toto. This is a grand idea, but there's just one slightly better
song to use: American Pie by Don McLean.
Now another interesting point is that Twitter recently doubled the character
limit for tweets: 140 -> 280 characters. So we could actually build our
American Pie bot to tweet up to that many characters. We're going to go
through the entire development lifecycle here, and we'll do it all in F#,
rather quickly.
Step 1: identify the problem (plus solution in this case)
The first step is to identify what our "problem" to be solved is. Well, isn't
it obvious? No one has built a bot for American Pie by Don McLean yet! That's
a real problem! Our solution will be to build said bot, and allow it to tweet
the lyrics to American Pie on a regular basis.
One might think the next step is to browse the Twitter API, but not yet.
We'll get to that in a moment. First we need to analyze the song and
determine how we want to split it.
I have heard this song so many times I happen to be able to type it from
memory, which I did, then I evaluated the lyrics for accuracy.
Now usually we would take this time to design a solution, but there really
isn't much to design. We basically need the following:
- Provide groupings of lyrics to the bot;
- Provide the timing / delay to the bot;
- Get it credentials;
- Periodically send a tweet of the next lyric group;
In larger software you would make this multiple steps; here we just make it
one because of how simple it is.
Step 2: provide groupings of lyrics to the bot
This is easy, and literally three lines of code in F#:
let path = @"C:\Users\Elliott Brown\Desktop\American Pie Lyrics.txt"
let file = System.IO.File.ReadAllText(path)
let parts = file.Split([|"\r\n\r\n"|], System.StringSplitOptions.None)
We literally load the file with the lyrics (attached to this post), and then
split it on double line breaks. That's it, we now have our lyric "parts". We
can verify that each part will fit into a tweet by testing:
let partLengths = parts |> Array.map Seq.length
Taking the maximum of those lengths, we should see that the largest group we
have is 212 characters. This will fit well within a tweet, so we should be
good to continue.
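A quick way to run that check could look like the sketch below. The sample text is a hypothetical stand-in for the lyrics file, and the names tweetLimit, sampleParts, longest and oversized are introduced here for illustration only.

```fsharp
// Hypothetical stand-in for the contents of the lyrics file.
let sample = "line one\r\nline two\r\n\r\nline three\r\nline four, a bit longer"

// Character limit mentioned at the top of this post.
let tweetLimit = 280

// Same split as above: groups are separated by a blank line.
let sampleParts = sample.Split([|"\r\n\r\n"|], System.StringSplitOptions.None)

// Length of the longest group (for the real file this is where 212 should show up).
let longest = sampleParts |> Array.map String.length |> Array.max

// Any groups that would not fit in a single tweet (ideally none).
let oversized = sampleParts |> Array.filter (fun p -> p.Length > tweetLimit)

printfn "Longest group: %d characters, oversized groups: %d" longest oversized.Length
```

Keep in mind that swapping each two-character "\r\n" for a three-character " / " later on adds one character per line break, so it is worth leaving a little headroom under the limit.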
Step 3: provide the timings / delays to the bot
For this, we want to check the Twitter API limiting information. It tells us
that we're limited to 2,400 tweets per day. For those not keen on the math,
that's 100 tweets an hour, which means we could send 1.667 tweets per minute,
or one tweet every 0.6 minutes. We won't come near that limit: we'll go with
5 minutes between each tweet, for a total of 12 per hour. This should be well
within the API limitations, and it should allow us to do exactly what we
want.
Initially, we'll define this as follows:
let minutesBetweenTweets = 5.0
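For anyone who wants to double-check the arithmetic, here is a small sketch. The 2,400 figure is the daily limit quoted above; every name below is introduced just for this example.

```fsharp
// Daily limit quoted above.
let tweetsPerDay = 2400.0

let tweetsPerHour = tweetsPerDay / 24.0            // 100 per hour
let tweetsPerMinute = tweetsPerHour / 60.0         // about 1.667 per minute
let minMinutesBetween = 1.0 / tweetsPerMinute      // about 0.6 minutes

// Same value as minutesBetweenTweets above.
let gapMinutes = 5.0

let ourTweetsPerHour = 60.0 / gapMinutes           // 12 per hour
let ourTweetsPerDay = ourTweetsPerHour * 24.0      // 288 per day

printfn "Limit: %.1f/hour, ours: %.1f/hour (%.0f/day)" tweetsPerHour ourTweetsPerHour ourTweetsPerDay
```

At a 5-minute gap the bot uses 288 of the 2,400 daily tweets, so there is plenty of room if we ever want to tweet more often.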
Step 4: get it credentials / API tokens
This is a little harder: we actually have to mess with Twitter for a moment
now. What we're going to do is head over to http://apps.twitter.com and
register an application. The rules changed a while ago, and you now need
a phone number attached to your account to register an application. For
whatever reason Twitter flagged the account I created and locked it, but it
was easy to unlock. (Not sure why; there are plenty of examples of folks
doing something like this.)
Alright, so once all that is done, we'll want to get a Consumer Key, Consumer
Secret, and generate an Access Token and Access Token Secret. I'm going to
use mine as the examples, but only for signature validation and I'm going to
be regenerating them afterwards.
let consumerKey = "AMbmXRe0nKymYOv23rzpBkggN"
let consumerSecret = "LIzIQNW79G5p8EyNbGjgxgkzvnjp7OImc6AdNKvbDPIPfReK5B"
let accessToken = "934423086732075008-qWUnoqnWByTTYJzKNrK8GT50MMYXE5B"
let accessTokenSecret = "fT1bo6TMVgLjLf74b16OIkdUAeyPamhk62si8QR1Xb2KJ"
Step 5: sign a request
The first thing we need to know when accessing the Twitter API is that all
requests are REST with special HTTP authorization headers. We also need to
"sign" requests, as documented on the Twitter Developers website.
To sign the request we need to know a few things:
- The HTTP method (POST in our case);
- The raw endpoint URL (https://api.twitter.com/1.1/statuses/update.json in
  our case);
- The query-string and OAuth parameters and values:
  - Query string: status
  - OAuth parameters:
    - oauth_consumer_key (consumerKey);
    - oauth_nonce (we'll generate this);
    - oauth_signature_method (HMAC-SHA1);
    - oauth_timestamp (Unix Epoch current time);
    - oauth_token (accessToken);
    - oauth_version (1.0);
Once we have all that, we can begin the signing process. The next thing we
have to do is "percent encode" the invalid characters. Twitter has a handy
guide on doing this, which I have written code for below:
let percentEncode (str : string) =
    str
    |> Seq.collect (fun x ->
        let cint = x |> int
        match cint with
        // Unreserved punctuation: - . _ ~
        | 0x2D | 0x2E | 0x5F | 0x7E -> [|x|]
        // Digits, upper-case letters, lower-case letters
        | cint when (cint >= 0x30 && cint <= 0x39)
                    || (cint >= 0x41 && cint <= 0x5A)
                    || (cint >= 0x61 && cint <= 0x7A) -> [|x|]
        // Everything else becomes %XX; "X2" keeps two hex digits, so "\n" is %0A rather than %A
        | cint -> (sprintf "%%%s" (cint.ToString("X2"))).ToCharArray())
    |> Seq.toArray
    |> System.String
Now I didn't account for any Unicode symbols, but the top three examples on
that page should be properly encoded.
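If you do need Unicode support, one way to get it is to encode the UTF-8 bytes of the string instead of its characters, so that each byte outside the unreserved set becomes its own %XX escape. The sketch below is an assumption about how that could look, not what the rest of this post uses; percentEncodeUtf8 and isUnreserved are names made up for this example.

```fsharp
open System.Text

// Sketch: percent-encode the UTF-8 bytes of a string.
// Unreserved bytes (letters, digits, - . _ ~) pass through, everything else becomes %XX.
let percentEncodeUtf8 (str : string) =
    let isUnreserved (b : byte) =
        (b >= 0x30uy && b <= 0x39uy)                               // 0-9
        || (b >= 0x41uy && b <= 0x5Auy)                            // A-Z
        || (b >= 0x61uy && b <= 0x7Auy)                            // a-z
        || b = 0x2Duy || b = 0x2Euy || b = 0x5Fuy || b = 0x7Euy    // - . _ ~
    let sb = StringBuilder()
    for b in Encoding.UTF8.GetBytes(str) do
        if isUnreserved b then sb.Append(char b) |> ignore
        else sb.Append(sprintf "%%%02X" b) |> ignore
    sb.ToString()

printfn "%s" (percentEncodeUtf8 "Ladies + Gentlemen")
printfn "%s" (percentEncodeUtf8 "\u2603")
```

The first call prints Ladies%20%2B%20Gentlemen, and the snowman character (U+2603) comes out as %E2%98%83, one escape per UTF-8 byte.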
After we build this percent encoding, we need to sign our request. Signing
is not nearly as hard as it could be here, and we actually have some examples
to fall back on. Basically, the process is as follows:
- Append the POST, query-string, and OAuth parameters together;
- Sort alphabetically by key, then value (Twitter does not allow duplicate
  keys across the three, so sorting by key will be the only requirement here);
- Percent encode all values (not keys, since keys have nothing requiring
  encoding);
- Join each key and value into key=value;
- Join all the key=value strings into key=value&key2=value2...;
- Build the base string: method&URL (percent encoded)&parameters (percent
  encoded again);
- Build the signing key: consumerSecret&accessTokenSecret (if
  accessTokenSecret is a blank string, leave the & in the result);
- Using HMAC-SHA1 with the signing key, hash the base string;
- Base-64 encode the result;
This is actually surprisingly easy with F#: we can do each step on its own,
or do a couple at a time. The key/value array sorting and such is pretty
simple, so I do them in a quick chain.
let sign method endpoint oauthParams queryParams postParams =
    let keyValues =
        oauthParams
        |> Array.append queryParams
        |> Array.append postParams
        |> Array.sortBy fst
        |> Array.map (fun (k, v) -> (k, v |> percentEncode))
    let concatedStr =
        ("&", keyValues |> Array.map (fun (key, value) -> sprintf "%s=%s" key value))
        |> System.String.Join
    let baseStr = sprintf "%s&%s&%s" method (endpoint |> percentEncode) (concatedStr |> percentEncode)
    let signKey =
        sprintf "%s&%s" consumerSecret accessTokenSecret
        |> System.Text.Encoding.ASCII.GetBytes
    use hmacSha1 = new System.Security.Cryptography.HMACSHA1(signKey)
    baseStr
    |> System.Text.Encoding.ASCII.GetBytes
    |> hmacSha1.ComputeHash
    |> System.Convert.ToBase64String
As you can see, we do the best we can to keep the code readable and to follow
what best practices we can. This makes the entire process quite painless: we
pass in our parameters and get the signature result.
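To see what the intermediate values look like, here is a self-contained sketch that builds the base string for a tiny parameter set and signs it with dummy secrets. The helpers enc, baseString and signWith, and the dummy secrets, are all made up for this example; unlike the code above, this sketch also percent encodes the two secrets before joining them, which makes no difference for purely alphanumeric secrets.

```fsharp
open System
open System.Security.Cryptography
open System.Text

// Compact percent-encoder over UTF-8 bytes (illustrative helper).
let enc (s : string) =
    Encoding.UTF8.GetBytes(s)
    |> Array.map (fun b ->
        let c = char b
        let unreserved = b < 128uy && (Char.IsLetterOrDigit(c) || "-._~".IndexOf(c) >= 0)
        if unreserved then string c else sprintf "%%%02X" b)
    |> String.concat ""

// method & encoded URL & encoded "key=value&key2=value2..." string, with keys sorted.
let baseString (httpMethod : string) (endpoint : string) (parameters : (string * string) list) =
    let paramStr =
        parameters
        |> List.sortBy fst
        |> List.map (fun (k, v) -> sprintf "%s=%s" (enc k) (enc v))
        |> String.concat "&"
    sprintf "%s&%s&%s" httpMethod (enc endpoint) (enc paramStr)

// HMAC-SHA1 over the base string, keyed with "consumerSecret&tokenSecret", then Base-64.
let signWith (consumerSecret : string) (tokenSecret : string) (baseStr : string) =
    let key = Encoding.ASCII.GetBytes(sprintf "%s&%s" (enc consumerSecret) (enc tokenSecret))
    use hmac = new HMACSHA1(key)
    let hash = hmac.ComputeHash(Encoding.ASCII.GetBytes(baseStr))
    Convert.ToBase64String(hash)

// Only two parameters, to keep the output readable; a real request needs all of them.
let demoParams = [ ("status", "Hello world"); ("oauth_version", "1.0") ]
let demoBase = baseString "POST" "https://api.twitter.com/1.1/statuses/update.json" demoParams
let demoSignature = signWith "dummyConsumerSecret" "dummyTokenSecret" demoBase

printfn "%s" demoBase
printfn "%s" demoSignature
```

The base string printed here is POST&https%3A%2F%2Fapi.twitter.com%2F1.1%2Fstatuses%2Fupdate.json&oauth_version%3D1.0%26status%3DHello%2520world. Notice how the space in the status ends up as %2520, because the parameter string is percent encoded a second time. The signature is always 28 characters, since HMAC-SHA1 produces 20 bytes.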
Step 6: form a request
If we look at the status update API documentation, we'll see that the
status update in general is actually pretty simple. We just POST to
https://api.twitter.com/1.1/statuses/update.json with a query-string
parameter of status={{StatusText}}, which is trivial with F#.
The first step is forming our OAuth string:
let formOAuthString p =
    let paramsStr =
        (", ", p
               |> Array.sortBy fst
               |> Array.map (fun (k, v) -> sprintf "%s=\"%s\"" (k |> percentEncode) (v |> percentEncode)))
        |> System.String.Join
    sprintf "OAuth %s" paramsStr
We know the values in this section won't have double-quotes, so there's no
need to guard here.
When dealing with F# (.NET) dates and times, we need to convert them to the
'unix epoch' times (which Twitter expects):
let timeToEpoch (time : System.DateTime) =
    (time.ToUniversalTime() - System.DateTime(1970, 1, 1, 0, 0, 0)).TotalSeconds
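As a sanity check, the hand-rolled conversion can be compared with the built-in DateTimeOffset.ToUnixTimeSeconds, assuming you are on .NET Framework 4.6 or later (or .NET Core). The names toEpochSeconds and viaDateTimeOffset are introduced here only for the comparison.

```fsharp
open System

// Hand-rolled conversion, same idea as timeToEpoch above but with an explicit UTC epoch.
let toEpochSeconds (time : DateTime) =
    let epoch = DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)
    (time.ToUniversalTime() - epoch).TotalSeconds |> int64

// Built-in equivalent (assumes .NET Framework 4.6+ or .NET Core).
let viaDateTimeOffset (time : DateTime) =
    DateTimeOffset(time).ToUnixTimeSeconds()

// One second after the epoch, and an arbitrary UTC sample time.
let oneSecondIn = DateTime(1970, 1, 1, 0, 0, 1, DateTimeKind.Utc)
let sampleTime = DateTime(2017, 11, 25, 12, 0, 0, DateTimeKind.Utc)

printfn "%d %d" (toEpochSeconds oneSecondIn) (viaDateTimeOffset oneSecondIn)
printfn "%d %d" (toEpochSeconds sampleTime) (viaDateTimeOffset sampleTime)
```

Both functions should agree for any input; if they don't, the usual suspect is a DateTime whose Kind isn't what you expected.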
Next, we form up our parameters and such for signing:
let url = "https://api.twitter.com/1.1/statuses/update.json"
let newStatus = parts.[0]
let newStatus = newStatus.Replace("\r\n", " / ")
printfn "%s" (newStatus |> percentEncode)
let timestamp = System.DateTime.Now |> timeToEpoch |> bigint
let nonce = System.Guid.NewGuid().ToString("N")
let oauthParams =
    [|("oauth_signature_method", "HMAC-SHA1")
      ("oauth_version", "1.0")
      ("oauth_consumer_key", consumerKey)
      ("oauth_timestamp", timestamp.ToString())
      ("oauth_token", accessToken)
      ("oauth_nonce", nonce)|]
let postParams = [||]
let queryParams = [|("status", newStatus)|]
let signature = sign "POST" url oauthParams queryParams postParams
let oauthParams = [|("oauth_signature", signature)|] |> Array.append oauthParams
let oauthString = oauthParams |> formOAuthString
And that's it, we have the request formed. The final part is to do the POST
itself.
Step 7: POST the request
To POST the request to Twitter we basically do three things: create a
WebClient; add the Authorization header; send the request. F# with .NET
makes this trivial, and we can even handle failure cases really easily.
try
    use wc = new System.Net.WebClient()
    wc.Headers.Add("Authorization", oauthString)
    let response = wc.UploadData(sprintf "%s?status=%s" url (newStatus |> percentEncode), [||]) |> System.Text.Encoding.UTF8.GetString
    printfn "Success: %s" response
with
// Only read the body when there is a response; otherwise fall through to the generic case.
| :? System.Net.WebException as ex when not (isNull ex.Response) ->
    use sr = new System.IO.StreamReader(ex.Response.GetResponseStream())
    printfn "Failure (HTTP %A): %A" ex.Status (sr.ReadToEnd())
| ex -> printfn "Failure: %A" ex
And that's it. Our Twitter bot now works. You'll also notice that I included
a .Replace("\r\n", " / ") call. Twitter is funky in how it treats line
breaks: if you include the real line breaks, it doesn't respect them
properly. When I tried it, the request also failed OAuth verification, so we
have to do something about that. My solution: replace them with a slash
indicating line breaks, which is somewhat frequently used in lyric sharing.
I made this a separate call for a reason: we're going to see in the next
post how to turn that initial parts.[0] into a calculation as to which
portion of the lyrics we are currently in. (That way we don't need the
application to run constantly; we can just run it once, and it will decide
what the next lyric to post is.)
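If the lyrics file might have been saved with Unix line endings, the replacement can be made a little more forgiving. The helper below is a sketch with a hypothetical name (joinLines), not something the rest of this post relies on, and the sample strings are placeholders rather than real lyrics.

```fsharp
// Sketch: join the lines of one lyric group with " / ",
// accepting both Windows ("\r\n") and Unix ("\n") line endings.
let joinLines (part : string) =
    part.Split([|"\r\n"; "\n"|], System.StringSplitOptions.RemoveEmptyEntries)
    |> String.concat " / "

printfn "%s" (joinLines "first line\r\nsecond line")
printfn "%s" (joinLines "first line\nsecond line")
```

Both calls print first line / second line. Listing "\r\n" before "\n" means a Windows line ending is consumed as a single separator, and RemoveEmptyEntries drops any stray blank lines.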
Step 8: clean things up
Now this entire application is about 90 lines for me, and works great, but
there's a lot of ugly there: we don't import (see: open) anything from .NET,
and we have to use String.Join, which is just ugly. Let's fix that up a bit:
module TwitterBot =
    open System
    open System.IO
    open System.Net
    open System.Security.Cryptography
    open System.Text

    let private consumerKey = "AMbmXRe0nKymYOv23rzpBkggN"
    let private consumerSecret = "LIzIQNW79G5p8EyNbGjgxgkzvnjp7OImc6AdNKvbDPIPfReK5B"
    let private accessToken = "934423086732075008-qWUnoqnWByTTYJzKNrK8GT50MMYXE5B"
    let private accessTokenSecret = "fT1bo6TMVgLjLf74b16OIkdUAeyPamhk62si8QR1Xb2KJ"
    let private url = "https://api.twitter.com/1.1/statuses/update.json"
    let private encoding = Encoding.ASCII

    // Pipe-friendly wrapper around String.Join
    let stringJoin (sep : string) (strs : string seq) = String.Join(sep, strs)

    let timeToEpoch (time : DateTime) =
        let epoch = DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)
        (time.ToUniversalTime() - epoch).TotalSeconds |> bigint

    let percentEncode : string -> string =
        Seq.collect (fun x ->
            match x |> int with
            | 0x2D | 0x2E | 0x5F | 0x7E -> [|x|]
            | cint when (cint >= 0x30 && cint <= 0x39)
                        || (cint >= 0x41 && cint <= 0x5A)
                        || (cint >= 0x61 && cint <= 0x7A) -> [|x|]
            // "X2" keeps two hex digits, so "\n" becomes %0A rather than %A
            | cint -> (sprintf "%%%s" (cint.ToString("X2"))).ToCharArray())
        >> Seq.toArray
        >> String

    let hmacSha1Hash (key : string) (str : string) : string =
        use hmacSha1 = new HMACSHA1(key |> encoding.GetBytes)
        str |> encoding.GetBytes |> hmacSha1.ComputeHash |> Convert.ToBase64String

    let sign method endpoint oauthParams queryParams postParams =
        let concatedStr =
            Array.concat [|oauthParams; queryParams; postParams|]
            |> Array.sortBy fst
            |> Array.map (fun (key, value) -> sprintf "%s=%s" key (value |> percentEncode))
            |> stringJoin "&"
        [|method; endpoint; concatedStr|]
        |> Array.map percentEncode
        |> stringJoin "&"
        |> hmacSha1Hash ([|consumerSecret; accessTokenSecret|] |> stringJoin "&")

    let formOAuthString : (string * string) array -> string =
        Array.sortBy fst
        >> Array.map (fun (k, v) -> sprintf "%s=\"%s\"" (k |> percentEncode) (v |> percentEncode))
        >> stringJoin ", "
        >> sprintf "OAuth %s"

    let buildTweetRequest tweet =
        let timestamp = DateTime.Now |> timeToEpoch
        let nonce = Guid.NewGuid().ToString("N")
        let oauthParams =
            [|("oauth_signature_method", "HMAC-SHA1")
              ("oauth_version", "1.0")
              ("oauth_consumer_key", consumerKey)
              ("oauth_timestamp", timestamp.ToString())
              ("oauth_token", accessToken)
              ("oauth_nonce", nonce)|]
        let postParams = [||]
        let queryParams = [|("status", tweet)|]
        let oauthString =
            [|("oauth_signature", sign "POST" url oauthParams queryParams postParams)|]
            |> Array.append oauthParams
            |> formOAuthString
        (oauthString, sprintf "%s?status=%s" url (tweet |> percentEncode), String.Empty)

    let run () =
        let path = @"C:\Users\Elliott Brown\Desktop\American Pie Lyrics.txt"
        let file = File.ReadAllText(path)
        let parts = file.Split([|"\r\n\r\n"|], StringSplitOptions.RemoveEmptyEntries)
        let newStatus = parts.[0]
        let newStatus = newStatus.Replace("\r\n", " / ")
        let oauthString, url, postParams = buildTweetRequest newStatus
        try
            use wc = new WebClient()
            wc.Headers.Add("Authorization", oauthString)
            let response = wc.UploadString(url, postParams)
            printfn "Success: %s" response
        with
        // Only read the body when there is a response; otherwise fall through.
        | :? WebException as ex when not (isNull ex.Response) ->
            use sr = new StreamReader(ex.Response.GetResponseStream())
            printfn "Failure (%s): %A" ex.Message (sr.ReadToEnd())
        | ex -> printfn "Failure: %A" ex
After all this rewriting, and making everything much clearer, my total line
count did not change. It literally didn't change at all (even with the new
open statements). I didn't change the total line count during this change,
and that's huge. I guess my point here is that you should never worry about
the line count: always write readable, understandable code first, then deal
with trimming lines down if absolutely necessary. (I usually don't worry
about it until it's become unreasonable: several thousand lines, that is.)
Step 9: admire our handiwork
You can see the result of our handiwork on Twitter. You'll start seeing
more and more tweets pop in there after we do the next lesson, when I
demonstrate how we can build a "smart algorithm" to decide what to tweet
next.
The takeaway from this lesson should be two things:
1. We can actually do some fun / entertaining things in programming;
2. We always have another opportunity to practice;
I really want you to try to find something to do with programming that you
enjoy, then try to learn how to do it. You can pick anything, easy, hard,
whatever you want, just pick something that you like. The easiest way to
convince yourself to keep learning is to find something you enjoy, and work
towards it.
Also, worry not about the "security" issue of me sharing keys and secrets, I
regenerated all four before this post.