How to create a Hacker News API GraphQL data source for GatsbyJS
Why build a source? Because GatsbyJS can query data only through GraphQL. Refer to Querying with GraphQL.
Let's make sure we are on the same page.
Now that we've cleared up some terms and concepts, let's review the Hacker News API.
The Official Hacker News API ("HN API" hereafter) exposes top level endpoints for "Top", "Best", and "New" stories.
The top-level endpoints return only IDs, with no other data associated with them.
Calling "https://hacker-news.firebaseio.com/v0/topstories.json" returns an array of story IDs:
[ 9127232, 9128437, 9130049, 9130144, 9130064, 9130028, 9129409, 9127243, 9128571, ..., 9120990 ]
So you need to make one call for each story ID returned from a top-level endpoint. It's not an optimal design, and the HN team admits it. But I am thankful that the HN team provides a public API for their stories.
So with that in mind, let's move on to creating a source.
Now let's see how to turn the Hacker News API into a GraphQL source by wrapping its data as nodes, following the steps below.
Let's get all top-level story IDs from the HN API.
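As a rough sketch of this step, assuming Node 18+ (for the global fetch) and the three list endpoints of the HN API, the ID arrays could be fetched in parallel. The names storyListUrl, fetchStoryIdLists, and fetchJson are hypothetical and chosen only for illustration.

```javascript
const HN_BASE_URL = "https://hacker-news.firebaseio.com/v0";

// Build the URL of a top-level list endpoint; kind is "top", "new", or "best".
function storyListUrl(kind) {
  return `${HN_BASE_URL}/${kind}stories.json`;
}

// Default fetcher (assumption: a global fetch exists, as in Node 18+).
const defaultFetchJson = url => fetch(url).then(res => res.json());

// Fetch the three ID arrays in parallel and return them keyed by list name.
async function fetchStoryIdLists(fetchJson = defaultFetchJson) {
  const [top, newest, best] = await Promise.all(
    ["top", "new", "best"].map(kind => fetchJson(storyListUrl(kind)))
  );
  return { top, new: newest, best };
}
```

Passing the fetcher in as a parameter keeps the function easy to try out with a stub instead of the live API.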
The Top, New, and Best lists contain duplicate stories, so let's cache only distinct story IDs.
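One way to drop the duplicates, sketched here with a hypothetical distinctStoryIds helper, is to pass the combined lists through a Set, which keeps only the first occurrence of each ID.

```javascript
// Merge any number of ID arrays and keep each ID once,
// in order of first appearance.
function distinctStoryIds(...idLists) {
  return [...new Set(idLists.flat())];
}

// Example with made-up IDs:
const ids = distinctStoryIds([1, 2, 3], [2, 3, 4], [4, 5]);
console.log(ids); // [ 1, 2, 3, 4, 5 ]
```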
Getting all stories is as simple as calling an endpoint with the story ID as part of the URL.
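A minimal sketch of that call, assuming the HN API's item endpoint and a global fetch (Node 18+), might look like this. The names storyUrl and fetchStories are hypothetical stand-ins, not the original implementation.

```javascript
// URL of a single item (story) in the HN API.
function storyUrl(id) {
  return `https://hacker-news.firebaseio.com/v0/item/${id}.json`;
}

// Fetch every story for the given IDs in parallel.
// fetchJson can be swapped for a stub; the default assumes a global fetch.
async function fetchStories(
  ids,
  fetchJson = url => fetch(url).then(res => res.json())
) {
  return Promise.all(ids.map(id => fetchJson(storyUrl(id))));
}
```

Promise.all starts every request at once, so for a long ID list it may be worth fetching in smaller batches instead.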
You are creating sources for "Top", "New", and "Best" stories, where "data" contains the arrays of story IDs fetched previously.
We've now fetched all the data, so let's create story nodes to expose it to GatsbyJS.
We've retrieved topResults, newResults, and bestResults in the previous step, and we now use them to create nodes as shown above.
Let's take a look at the implementation of the aptly named createStoryNodes method.
The shape is defined by storyNode between lines 4~11. Let's go over each property.
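For orientation, here is a hedged sketch of what such a node shape could look like. GatsbyJS requires id, parent, children, internal.type, and internal.contentDigest on every node; the storyId and item fields, and the buildStoryNode name itself, are assumptions made for this illustration.

```javascript
// Build a Gatsby-style node for one story.
// items is a Map from story ID to the fetched story object.
function buildStoryNode(type, storyId, items, contentDigest) {
  return {
    id: `${type}-${storyId}`, // must be unique across all nodes
    parent: null, // this node has no parent node
    children: [], // and no child nodes
    internal: { type, contentDigest }, // type becomes the GraphQL type name
    storyId, // assumed field: the raw HN story ID
    item: items.get(storyId), // assumed field: the full story data
  };
}
```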
Remember that we defined the getStories function but never called it? items is a map of all stories fetched using getStories, as shown below.
The code above fetches the stories and caches them in a map, from which we can construct the story nodes. A new Map object (not Array#map) is used for constant-time (O(1)) lookups, which makes data retrieval efficient.
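The caching idea can be sketched as follows, assuming each fetched story carries its own id field (as HN items do). The helper name toStoryMap is hypothetical, and skipping empty results is an added precaution rather than something the text above describes.

```javascript
// Index fetched stories by ID for constant-time lookup.
// Empty results (null or undefined) are skipped.
function toStoryMap(stories) {
  return new Map(
    stories.filter(Boolean).map(story => [story.id, story])
  );
}

// Example with made-up stories:
const items = toStoryMap([
  { id: 1, title: "First" },
  null,
  { id: 2, title: "Second" },
]);
console.log(items.get(2).title); // Second
```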
Content Digest (scroll down to "Parameters") helps GatsbyJS track whether data has changed, which lets it work more efficiently. The implementation of buildContentDigest is shown below.
It serializes the story and hashes it with the MD5 algorithm into a hex representation. Honestly again, I used the implementation from the documentation, as I don't know much about GatsbyJS's internal details.
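A digest function along those lines can be sketched with Node's built-in crypto module. This is a minimal version assuming the content is JSON-serializable; treat it as an illustration of the technique rather than the exact code used here.

```javascript
const crypto = require("crypto");

// Serialize the content to JSON, hash it with MD5, and return
// the hash as a hex string. The same content yields the same digest.
function buildContentDigest(content) {
  return crypto
    .createHash("md5")
    .update(JSON.stringify(content))
    .digest("hex");
}
```

Because the digest depends only on the serialized content, an unchanged story produces the same value on every build, which is what lets GatsbyJS detect changes.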
Now you export the stories source for GatsbyJS at the bottom of the gatsby-node.js file.
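GatsbyJS picks up a source through the sourceNodes export of gatsby-node.js. The sketch below assumes Gatsby v2 or later, where createNode lives under actions (v1 exposed it under boundActionCreators). The makeSourceNodes and loadNodes names are hypothetical.

```javascript
// Create a sourceNodes implementation from a loader that resolves
// to an array of ready-made node objects.
function makeSourceNodes(loadNodes) {
  return async ({ actions }) => {
    const { createNode } = actions;
    const nodes = await loadNodes();
    nodes.forEach(node => createNode(node));
  };
}

// At the bottom of gatsby-node.js this could be wired up as:
// exports.sourceNodes = makeSourceNodes(loadStoryNodes);
// where loadStoryNodes is your own function that fetches and builds nodes.
```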
GatsbyJS passes your component props containing a data property, which in turn contains the actual data fetched using GraphQL.
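To make the shape of that data property concrete, here is a hedged sketch. It assumes a node type named TopStories (GatsbyJS exposes each node type under an all-prefixed root field) and the storyId and item fields; all of these names are illustrative assumptions. In a real page the query string would be wrapped in the graphql tag from gatsby.

```javascript
// Hypothetical page query for nodes of an assumed type "TopStories".
const topStoriesQuery = `
  {
    allTopStories {
      edges {
        node {
          storyId
          item { title url }
        }
      }
    }
  }
`;

// Pull the story titles out of the props GatsbyJS hands to a page.
function titlesFromProps({ data }) {
  return data.allTopStories.edges.map(({ node }) => node.item.title);
}
```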
Here is the full source code of gatsby-node.js.
The code might not be optimal at fetching data, but the static site generator caches the data before generating the site, so it won't affect the site's performance in the end.
But I'd love to see if you have any suggestions on how to improve it :)