Pandoc can be extended in a range of languages, and an option such as Python is likely to be widely palatable given the language's popularity and the supporting libraries provided. I personally tend to use one of the two more direct options, depending on context.
Lua filters are what I use by default in a team setting. They have the enormous advantage that the interpreter is built into Pandoc, so there are no further dependencies. They have the potential disadvantages that Lua itself doesn't include many batteries (so there tend to be local libraries for anything non-trivial) and that there's typically not a lot of design-time safety, so many errors may not be detected until run time. The latter can be addressed by a test harness (and should be in larger projects), but in more limited use such testing may be unnecessary.
Haskell filters provide earlier feedback in a more powerful language. Haskell is also the primary language of Pandoc, so activities such as skimming code or Haddock documentation are more natural. It unfortunately expects the environment to be set up properly, and Haskell is likely to be less approachable for many developers. I therefore use Haskell primarily for personal use.
For now my need is to replace .org with .html in link targets. This is likely simple enough that I could use something like bash, but the Pandoc filter should also be very simple and should be a bit more reliable.
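Before getting to the filter itself, the rewrite rule for a single link target can be sketched in isolation. This is only an illustration: the helper name orgToHtml is hypothetical and is not part of the filter file, and it uses stripSuffix from Data.Text rather than the approach taken in the filter.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical helper (not part of the filter file): swap a trailing
-- ".org" for ".html" and leave every other target untouched.
orgToHtml :: Text -> Text
orgToHtml target =
  case T.stripSuffix ".org" target of
    Just base -> base <> ".html"  -- target ended in ".org"
    Nothing   -> target           -- anything else passes through
```

With this rule, "notes/today.org" becomes "notes/today.html" while "diagram.png" is unchanged. One thing the sketch makes visible is that the rule only looks at the end of the string, so an external target that happens to end in .org (a bare domain, for example) would be rewritten as well.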
The file starts off with the shebang per the filter documentation (1).
#!/usr/bin/env runhaskell
OverloadedStrings
The OverloadedStrings pragma is used to help ease some of the coercion across Text and Strings.
{-# LANGUAGE OverloadedStrings #-}
Import Data.Text for some text manipulation functions.
import Data.Text
Import Text.Pandoc.JSON to use the filter support function.
import Text.Pandoc.JSON
The entrypoint for the script will make use of the Pandoc-provided toJSONFilter.
This defines main as an IO action that hands linkFixer to the higher-order function toJSONFilter.
main :: IO ()
main = toJSONFilter linkFixer
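As an aside (this is not part of the filter file), the following is a rough sketch of what I understand toJSONFilter to do when given a function of type Inline -> Inline: read the JSON document from stdin, walk the function over every Inline, and write the JSON back out. The names transformJson and runInlineFilter are hypothetical, and the sketch assumes the aeson, bytestring and pandoc-types packages are available.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (eitherDecode, encode)
import qualified Data.ByteString.Lazy as BL
import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

-- | Hypothetical pure core: decode the JSON AST, apply the function to
-- every Inline in the document, and encode the result again.
transformJson :: (Inline -> Inline) -> BL.ByteString -> Either String BL.ByteString
transformJson f input = do
  doc <- eitherDecode input
  pure (encode (walk f (doc :: Pandoc)))

-- | Hypothetical stdin/stdout wrapper around the pure core, which is
-- roughly the shape of a JSON filter's main.
runInlineFilter :: (Inline -> Inline) -> IO ()
runInlineFilter f = do
  input <- BL.getContents
  either error BL.putStr (transformJson f input)
```

Keeping the decode/walk/encode step pure is a choice made for this sketch so that it can be exercised without piping data through stdin.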
Given that the focus is on links, the filter will deal with Inline elements.
linkFixer :: Inline -> Inline
Links whose targets end with .org should have that replaced with .html which will make use of a guard clause.
Other links will fall through to the subsequent pattern.
linkFixer (Link attr inline (href, title))
  | ".org" `isSuffixOf` href =
      Link attr inline (fixedHref, title)
  where fixedHref = dropEnd 4 href <> ".html"
Nothing to see here.
linkFixer e = e
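A quick way to sanity-check the filter without running Pandoc is to walk it over a small hand-built document. The following is a minimal sketch: linkFixer is repeated so the block stands alone, while sample, expected and filterWorks are hypothetical names introduced only for this check, and pandoc-types is assumed to be installed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (dropEnd, isSuffixOf)
import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

-- The filter function from this post, repeated for self-containment.
linkFixer :: Inline -> Inline
linkFixer (Link attr inline (href, title))
  | ".org" `isSuffixOf` href =
      Link attr inline (fixedHref, title)
  where fixedHref = dropEnd 4 href <> ".html"
linkFixer e = e

-- Hypothetical sample: one .org link and one link that should be left alone.
sample :: Pandoc
sample = Pandoc nullMeta
  [ Para [ Link nullAttr [Str "journal"] ("journal.org", "")
         , Space
         , Link nullAttr [Str "image"] ("diagram.png", "")
         ]
  ]

-- The same document with only the .org target rewritten.
expected :: Pandoc
expected = Pandoc nullMeta
  [ Para [ Link nullAttr [Str "journal"] ("journal.html", "")
         , Space
         , Link nullAttr [Str "image"] ("diagram.png", "")
         ]
  ]

-- True when walking the filter over the sample yields the expected document.
filterWorks :: Bool
filterWorks = walk linkFixer sample == expected
```

For real use, the script would be made executable and passed to Pandoc with the --filter option, for example pandoc --filter ./link-fixer.hs, where link-fixer.hs is a placeholder file name.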
The content in this site is initially driven off of writing what amounts to a daily diary. I started to swap to this flow as part of using LogSeq (which for the time being I've shifted away from) and find it a good mechanism to both capture more information with less overhead and keep better track of what I've been doing (I've never been one to keep a journal…or pay much attention to things like time).
Certain sections of information will then be periodically distilled into pages. I'll likely try to keep the journals focused on activities and the pages more on results (with links or embeddings to relate the two processes). For this reason, and for general cleanup, journal entries are subject to revision.
I'm capturing this information now since one of my next tasks will be integrating this filter into the work captured yesterday, so that will be the first occasion to extract and clean up some of the longer-lived information.
The longer-term goal is to produce a graph of information, which will hopefully be accomplished through a combination of links and supporting tools.
At some point in the past few days I was also thinking that all of the journal files in the same directory could optimistically get a bit bloated over time. Splitting them out by year would be perfectly sufficient, but I suppose for consistency I'll split them out down to months. This may also be helpful if I end up wanting to attach some assets (like images) to journal entries (which is fairly likely).
I'll work on the three items above, which shouldn't take much time, but updates will be blocked until they're done.