The interior of a grand library, with bookshelves on the walls and blue-green pillars spaced throughout

It's that magical time of year again, when I rewrite my personal website. Fellow software engineers know the pain of never quite being satisfied, the knowledge that whether you spend the rest of your life working on one project or dozens, you will never be done. The Sisyphean wheel keeps on turning. I try not to think about it, so let's focus on what I've changed this time around.

Rewrote it in Rust

The biggest difference is that I ported the entire site from Remix to Leptos, from a JS framework to a Rust one. I didn't do it because Remix is deficient, or because Rust is so much faster than TypeScript. Although the Rust version should render a bit faster and handle more load, like most websites this one is usually bound by IO: the time to talk to the DB, ~100ms or so, significantly dwarfs any improvement in render speed. No, I did it because I honestly enjoy writing Rust more than TypeScript. Its type system, compiler, and package manager are a joy to use.

I've also been spending a bunch of time contributing to Leptos, a Rust frontend framework inspired by SolidJS. I've built and helped improve its integration with the Axum Rust web server, made routing more ergonomic through the creation of the LeptosRoutes trait, and made server fns more versatile through the use of alternative serialization formats like CBOR. I've really enjoyed doing that work, and I am extremely enthusiastic about the future of full stack Rust as an option for web developers.

Github as a CMS

The previous version of the site used an Sqlite database to store markdown content and post details, in a format that looks like this:

sql
CREATE TABLE IF NOT EXISTS posts (
  id         INTEGER PRIMARY KEY AUTOINCREMENT,
  user_id    INTEGER NOT NULL,
  title      TEXT NOT NULL,
  excerpt    TEXT,
  content    TEXT NOT NULL,
  tags       TEXT,
  slug       TEXT NOT NULL,
  published  INTEGER DEFAULT 0 NOT NULL,
  preview    INTEGER DEFAULT 0 NOT NULL,
  hero       TEXT,
  publish_date INTEGER,
  links      TEXT,
  created_at INTEGER DEFAULT(unixepoch()) NOT NULL,
  updated_at INTEGER DEFAULT(unixepoch()) NOT NULL,  
  FOREIGN KEY (user_id) REFERENCES users (id)
) STRICT;

It worked perfectly fine, but it had some drawbacks. Because Sqlite is basically a DB in a file, I'd either need to add new posts through the deployed app's post editor, or make the changes locally and redeploy the site with the updated DB file. If I edited a post in the deployed app, or added new content, I'd then need to push those changes from the app back to Github. I also couldn't use the lovely Markdown editing tools I've grown accustomed to without a bunch of copy/pasting, and it enmeshed content with the layout. So I resolved to find an alternative that met these two goals:

  1. Posting new content would not require redeploying the site itself
  2. I could write with Typora or nvim or whatever Markdown tools I like, without having to build a post editor

If Github is my source of truth, why not eliminate the middleman? If I store posts in Github as Markdown files, I can update them separately, use all the tools I already use for code, and not worry about the status of the DB. Plus, Github has an API with a generous rate limit of ~5,000 calls per hour for authenticated requests. With proper caching and some tricks, I'll never get close to that.

Building It

I thought I'd have to investigate the Github API, but it turns out the Octocrab crate's done the work for me. Thanks XAMPPRocky! I figured that since the content is all text, I can just store it all in memory, and make it accessible to Leptos.

Rust code
use chrono::{DateTime, Utc};
use indexmap::IndexMap;
use serde::{Deserialize, Serialize};

#[derive(Clone, Debug, Serialize, Deserialize, Default)]
pub struct Posts {
    pub posts: IndexMap<String, Post>,
    pub last_checked: DateTime<Utc>,
}

I chose an IndexMap because I wanted to be able to sort the Posts by date published, and to fetch content by post slug. A HashMap<String, Post> would give me the second requirement, a Vec<Post> the first, and a BTreeMap<String, Post> sorts by key rather than by date. IndexMap lets me keep the posts in date order while still fetching by slug.
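
A minimal sketch of those two access patterns, using some made-up data: sort the map by value, then keep looking things up by key.

Rust code
use indexmap::IndexMap;

fn main() {
    // Hypothetical data: slug -> publish timestamp
    let mut posts: IndexMap<String, i64> = IndexMap::new();
    posts.insert("post-2".into(), 200);
    posts.insert("post-1".into(), 100);
    posts.insert("post-3".into(), 300);

    // Sort by value (newest first) while the map stays keyed by slug
    posts.sort_unstable_by(|_k1, v1, _k2, v2| v2.cmp(v1));

    // Ordered iteration for the index page...
    assert_eq!(posts.keys().next().map(|s| s.as_str()), Some("post-3"));
    // ...and cheap lookup by slug for the post page
    assert!(posts.get("post-1").is_some());
}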

Since Leptos and Axum are multithreaded, and each incoming request could be handled by a separate thread, we also need this object to be both thread safe and mutable. Thread safe means an Arc, and the mutability requirement means either a Mutex or a RwLock. I chose RwLock because writes will be rare, and we don't want reads of the Posts cache to block each other (see the short sketch after the definition below).

Rust code
#[derive(Clone, Debug, Default)]
pub struct PostsContainer(pub Arc<RwLock<Posts>>);

Interior Mutability FTW
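
As a quick sketch of the locking behavior (assuming parking_lot's RwLock, which is what the imports later in this post use): reads can happen concurrently, while the rare write takes the lock exclusively.

Rust code
// A hypothetical helper showing how the container gets used (parking_lot RwLock assumed)
fn demo_locking(container: &PostsContainer) {
    {
        // Any number of request handlers can hold read guards at the same time
        let reader = container.0.read();
        println!("{} posts cached", reader.posts.len());
    } // read guard dropped here

    // Writes (refetching from Github) are rare and take the lock exclusively,
    // briefly blocking readers
    let mut writer = container.0.write();
    writer.last_checked = chrono::Utc::now();
}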

We'll need a Github repo to store posts, with a consistent format like the one below. If we use the post slugs as the names of both the folders and the markdown files, we can easily query it. The posts folder here might be redundant, but if you want to have other content like images or supporting material, it'll keep things clean. I shall call the repo deranged_ramblings. It can be private, but keep in mind you'll then need an API token to query it.

.
├── README.md
└── posts
    ├── post-1
    │   └── post-1.md
    ├── post-2
    │   └── post-2.md
    └── post-3
        └── post-3.md

Now that we have a Github repo, we can give our structs some functions to fetch that content.

Rust code
impl Posts {
    /// If we have no posts, fetch them from the github repo
    pub async fn fetch_posts_from_github(&mut self) -> Result<(), BenwisAppError> {
        // If you're using a private repo, or want the full API limit, you'll want to
        // generate a Github token and provide it as an env var
        let api_token = std::env::var("GITHUB_TOKEN").expect("Failed to get Github Token");
        let octocrab = Octocrab::builder()
            .personal_token(api_token)
            .build()
            .unwrap();
        let content = octocrab
            .repos("github_username", "deranged_ramblings")
            .get_content()
            .path("posts") // All our posts are in the post folder
            .send()
            .await
            .unwrap();

        // Get the paths to all the post folders, filtering out anything else by path
        let content_paths: Vec<Content> = content
            .items
            .into_iter()
            .filter(|i| i.path.starts_with("posts/"))
            .collect();

        //Iterate over all the post folders
        for content in content_paths {
            let mut post_contents = octocrab
                .repos("github_username", "deranged_ramblings")
                .get_content()
                .path(content.path)
                .send()
                .await
                .unwrap();

            //Filter out all paths that do not end with .md
            post_contents.items.retain(|p| p.path.ends_with(".md"));

            let contents = post_contents.take_items();
            for item in contents {
                // Fetch it again individually, because it isn't included above
                // Not really sure why tbh
                let mut content = octocrab
                    .repos("github_username", "deranged_ramblings")
                    .get_content()
                    .path(item.path)
                    .send()
                    .await
                    .unwrap();
                let contents = content.take_items();
                let c = &contents[0];
                let decoded_content = c.decoded_content().unwrap();

                // You'll want to create a Post struct to represent your content
                // and impl TryFrom<String> for Post so you can do the below
                let post: Post = decoded_content.try_into()?;

                self.posts.insert(post.slug.clone(), post);
            }
        }
        Ok(())
    }
    /// Get the post. If it's not in the local cache, check Github. If it's not in Github, return None
    /// TODO: Don't duplicate builders
    pub async fn fetch_post_from_github(&mut self, slug: &str) -> Result<Option<Post>, BenwisAppError> {
        let api_token = std::env::var("GITHUB_TOKEN").expect("Failed to get Github Token");
        let octocrab = Octocrab::builder()
            .personal_token(api_token)
            .build()
            .unwrap();
        // We know exactly what we're querying, so this should be simpler
        let post_path = format!("posts/{}/{}.md", slug, slug);

        let mut content = match octocrab
            .repos("github_username", "deranged_ramblings")
            .get_content()
            .path(post_path)
            .send()
            .await
        {
            Ok(p) => p,
            Err(_) => return Ok(None),
        };
        let contents = content.take_items();
        let c = &contents[0];
        let decoded_content = c.decoded_content().unwrap();

        let post: Post = decoded_content.try_into()?;

        self.posts.insert(post.slug.clone(), post.clone());
        Ok(Some(post))
    }
}
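
The code above leans on a Post type and a TryFrom<String> impl that I haven't shown. Here's a minimal sketch of what that might look like, with just the fields the rest of this post touches; treat the field list and the stubbed parsing as placeholders, since the real version would parse front matter and render the markdown to HTML.

Rust code
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

// A hypothetical Post shape with the fields used elsewhere in this post
#[derive(Clone, Debug, Serialize, Deserialize, Default)]
pub struct Post {
    pub slug: String,
    pub title: String,
    pub excerpt: Option<String>,
    pub content: String,      // markdown rendered to HTML
    pub toc: String,          // rendered table of contents
    pub created_at: DateTime<Utc>,
    pub published: bool,
    pub preview: bool,
}

impl TryFrom<String> for Post {
    type Error = BenwisAppError;

    fn try_from(raw: String) -> Result<Self, Self::Error> {
        // Real code would parse front matter and render the markdown body here.
        // This stub just stores the raw markdown so the shape is clear.
        Ok(Post {
            content: raw,
            ..Default::default()
        })
    }
}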

Now let's create a constructor function that builds a PostsContainer with the Posts already fetched.

Rust code
impl PostsContainer {
    pub async fn new_with_posts() -> Result<Self, BenwisAppError> {
        let mut posts = Posts::default();
        posts.fetch_posts_from_github().await?;
        // Sort the posts by creation date, newest first
        posts.posts.sort_unstable_by(|_a, b, _c, d| d.created_at.partial_cmp(&b.created_at).unwrap());
        let container = PostsContainer(Arc::new(RwLock::new(posts)));
        Ok(container)
    }
}

Now that we've got the Post content done, let's plumb it into Axum and Leptos. The easiest way to do that is to create our PostsContainer, and then pass it to Axum's State system to make it available to all the handlers processing requests. The handlers can then put it in Leptos' Context to make it available there. That looks roughly like this.

Rust code
use axum::{
    extract::{FromRef, State},
    routing::{get, post},
    Router,
};
use leptos::{get_configuration, logging, provide_context, LeptosOptions};
use leptos_axum::{generate_route_list, handle_server_fns_with_context, LeptosRoutes};

// Our State object. FromRef lets Axum hand each field to extractors individually
#[derive(FromRef, Clone)]
struct AppState {
    posts: PostsContainer,
    leptos_options: LeptosOptions,
}

#[tokio::main]
async fn main() {
    let conf = get_configuration(None).await.unwrap();
    let leptos_options = conf.leptos_options;
    let addr = leptos_options.site_addr;
    let routes = generate_route_list(BenwisApp);

    // Create our State object. PostsContainer::new_with_posts() fetches the Posts
    // from Github and wraps them in Arc<RwLock<_>> for interior mutability
    let shared_state = AppState {
        posts: PostsContainer::new_with_posts().await.unwrap(),
        leptos_options: leptos_options.clone(),
    };

    // build our application with a route
    let app = Router::new()
        .route("/api/*fn_name", post(server_fn_handler))
        .leptos_routes_with_handler(routes, get(leptos_routes_handler))
        .fallback(file_and_error_handler)
        .with_state(shared_state);

    // run our app with hyper
    // `axum::Server` is a re-export of `hyper::Server`
    logging::log!("listening on http://{}", &addr);
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

Next we need to pass it to Leptos so that it can be used inside it. To do that, we'll write two Axum handlers to extract the state and pass it to Leptos before we render our app.

Rust code
// This handler handles regular page requests
async fn leptos_routes_handler(State(app_state): State<AppState>, req: Request<AxumBody>) -> Response {
    let handler = leptos_axum::render_app_to_stream_with_context(
        app_state.leptos_options.clone(),
        move || {
            provide_context(app_state.posts.clone());
        },
        || view! { <BenwisApp/> },
    );
    handler(req).await.into_response()
}

// This handler handles data and API requests to the server. We need both!
async fn server_fn_handler(
    State(app_state): State<AppState>,
    path: Path<String>,
    headers: HeaderMap,
    raw_query: RawQuery,
    request: Request<AxumBody>,
) -> impl IntoResponse {
    log!("{:?}", path);

    handle_server_fns_with_context(
        path,
        headers,
        raw_query,
        move || {
            provide_context(app_state.posts.clone());
        },
        request,
    )
    .await
}

Great! Now all we need to do is write some server functions so that we can asynchronously load the content when the page is rendered.

Server Functions

Server Functions are neat little isomorphic functions that let you run a function on your server, and stream the result down to the client. The server macro handles serialization and deserialization of your data. The inputs to the function are what the client will need to provide, and the response is what the client will receive from the server. Once the client receives the result, it'll "hydrate" it into your code through the power of webassembly.
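
Before the real one, here's a minimal, hypothetical server function just to show the shape (AddTodo and add_todo are made-up names): the body only ever runs on the server, while the client gets a generated stub that calls it over HTTP.

Rust code
// A made-up server function: the body only runs on the server, but the client can
// call add_todo(...) like any other async function. The #[server] macro generates
// the endpoint under /api and handles (de)serialization of arguments and result.
#[server(AddTodo, "/api")]
pub async fn add_todo(title: String) -> Result<(), ServerFnError> {
    // Server-only code (DB queries, secrets, filesystem access) goes here
    println!("Adding todo: {title}");
    Ok(())
}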

Let's write a server function that can get a particular number of posts, sorted in descending chronological order, from our in-memory Posts cache. If it can't find them, and enough time has passed, it'll try to fetch them from Github again.

Rust code
#[server(GetPosts, "/api")]
pub async fn get_posts(num: Option<usize>) -> Result<Result<Posts, BenwisAppError>, ServerFnError> {
    // Get Posts out of Leptos' Context
    let Some(posts) = use_context::<PostsContainer>() else {
        return Ok(Err(BenwisAppError::InternalServerError));
    };

    let mut reader = posts.0.read();

    // If there are no Posts, try to fetch some, but only if we haven't checked in the
    // last minute. Wouldn't want to let people lock us out of the Github API.
    // Technically it shouldn't be possible to get here, as Posts will always have content
    if reader.posts.len() == 0 && (Utc::now() - reader.last_checked) >= Duration::minutes(1) {
        log!("Refetching");
        // Drop the read lock before taking the write lock, so we don't deadlock ourselves
        drop(reader);
        {
            let mut writer = posts.0.write();
            writer.fetch_posts_from_github().await?;
            writer.last_checked = Utc::now();

            // Sort Posts by created_at date in descending order
            writer
                .posts
                .sort_unstable_by(|_, a, _, b| b.created_at.partial_cmp(&a.created_at).unwrap());
        }
        // Re-acquire the read lock now that the cache is populated
        reader = posts.0.read();
    }

    let mut processed_posts = IndexMap::new();
    match num {
        Some(n) => reader.posts.iter().take(n).for_each(|(k, v)| {
            processed_posts.insert(k.to_owned(), v.to_owned());
        }),
        None => processed_posts = reader.posts.clone(),
    };

    let out = Posts {
        posts: processed_posts,
        last_checked: reader.last_checked.clone(),
    };
    Ok(Ok(out))
}

That covers getting all the content for our cache, but often we just want one Post for our post page.

Rust code
#[server(GetPost, "/api")]
pub async fn get_post(slug: String) -> Result<Result<Option<Post>, BenwisAppError>, ServerFnError> {
    let Some(posts) = use_context::<PostsContainer>() else {
        return Err(ServerFnError::ServerError(
            "Failed to get Posts".to_string(),
        ));
    };

    let mut reader = posts.0.read();

    // If we can't find the post they want, try to get it from Github,
    // but only if we haven't checked in the last minute
    if reader.posts.get(&slug).is_none()
        && (Utc::now() - reader.last_checked) >= Duration::minutes(1)
    {
        log!("Fetching {slug}");
        // Drop the read lock before taking the write lock, so we don't deadlock ourselves
        drop(reader);
        {
            let mut writer = posts.0.write();
            writer.fetch_post_from_github(&slug).await?;
            writer.last_checked = Utc::now();

            // Sort Posts by created_at date in descending order
            writer
                .posts
                .sort_unstable_by(|_, a, _, b| b.created_at.partial_cmp(&a.created_at).unwrap());
        }
        // Re-acquire the read lock so we can see the newly fetched post
        reader = posts.0.read();
    }

    let post = match reader.posts.get(&slug) {
        Some(p) => Some(p.to_owned()),
        None => None,
    };
    Ok(Ok(post))
}

Rendering a Post Page

I won't bore y'all too much more by providing all the places I use posts, but I will briefly discuss my Leptos post page.

Rust code
#[derive(Params, PartialEq, Clone, Debug)]
// We use this struct to get the path params, a la /posts/:slug
pub struct PostParams {
    pub slug: String,
}
#[component]
pub fn Post() -> impl IntoView {
    let params = use_params::<PostParams>();
    // A Blocking Resource lets you wait for the result of an async function
    // before streaming the page down. It's good for SEO.
    // The first arg is what would cause it to rerun the function,
    // the second is our get_post server_fn from earlier
    let post = create_blocking_resource(
        move || params().map(|params| params.slug).ok().unwrap(),
        move |slug| get_post(slug),
    );

    view! {
        <Transition fallback=move || {
            view! {  <p>"Loading..."</p> }
        }>
            { move || post.get().map(|p|{ match p {
                Ok(Ok(Some(post))) => {
                    view! {  <PostContent post={post}/> }
                        .into_view()
                }
                Ok(Ok(None)) => {
                    view! {  <p>"Post Not Found"</p> }
                        .into_view()
                }
                Ok(Err(_)) => {
                    view! {  <p>"Server Error"</p> }
                        .into_view()
                }
                Err(_) => {
                    view! {  <p>"Server Fn Error"</p> }
                        .into_view()
                }
            }})
            }
        </Transition>
    }
}
// The component which will render our blog post. It takes a Post that we found in the component above
#[component]
pub fn PostContent(post: Post) -> impl IntoView {
    view! {
        <section class="">
            <div class="">
                <a href="/posts" class="">
                    "Back to Posts"
                </a>
            </div>
            {(post.preview || post.published)
                .then(|| {
                    view! {
                        <h1 class="">{post.title.clone()}</h1>
                        <div class="">{post.created_at.to_string()}</div>
                        <section class="">
                            <h2 class="">"Contents"</h2>
                            <div
                                class=""
                                inner_html={post.toc}
                            ></div>
                        </section>
                        <section
                            class=""
                            inner_html={post.content}
                        ></section>
                    }
                })}
        </section>
    }
}

Much like Remix's loaders, Leptos uses a Resource to keep track of the state of async functions inside our app. Here we're passing it our get_post() server function, so that it will track the result. The Transition component handles the view side of things, displaying "Loading..." while it loads, the content once it's done, and a variety of error responses. Unlike Remix's loaders, you can have as many Resources as your heart desires, and different parts of the page can be hydrated independently. We use create_blocking_resource() here because we want to wait for the post content to finish loading before we return the page response. This helps with SEO, as web crawlers will most likely not hydrate your page to get the content.
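
As a rough illustration of that independence, here's a hypothetical component with two Resources (get_profile is a made-up server function); each one loads and hydrates on its own, so a slow fetch in one doesn't hold up the other.

Rust code
#[component]
pub fn Home() -> impl IntoView {
    // Two independent Resources: each tracks its own async fetch
    let posts = create_resource(|| (), |_| get_posts(Some(5)));
    let profile = create_resource(|| (), |_| get_profile()); // hypothetical server fn

    view! {
        <Transition fallback=|| view! { <p>"Loading posts..."</p> }>
            {move || posts.get().map(|_posts| view! { <p>"Posts loaded!"</p> })}
        </Transition>
        <Transition fallback=|| view! { <p>"Loading profile..."</p> }>
            {move || profile.get().map(|_profile| view! { <p>"Profile loaded!"</p> })}
        </Transition>
    }
}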

Conclusion

I'm very happy with the rewrite; it should help reduce the friction of producing new content. I can work on my site in my favorite language, and I might even get a little performance boost from fetching posts from memory instead of from Sqlite.

If you're interested in seeing the whole site, it can be found here.