
I had been looking into a way to store and display for my own purposes a set of topical text files, and had been eschewing somewhat-heavy local database “shoebox” type information managers such as DevonThink or Yojimbo on the Mac. A simple text file is enough for many purposes, so I initially tried Notational Velocity, because of its capability to sync to SimpleNote. Further, NV can be set to allow one to access the data folder (~/Library/Application Support/Notational Data) directly, so that markdown-formatted text files can be saved there with a regular text editor such as TextMate or SubEthaEdit, for a clean and easy workflow.

However, although I can indeed view SimpleNote on iPhone or via a web browser, I am uncomfortable with putting certain material in the cloud, and would rather self-host so that I have control over storage, backup and dissemination. I experimented with Gollum, a Git-based wiki written in Ruby, but it is served via Sinatra, and so is not easy to secure out of the box with something like an .htaccess file. There are plenty of static site generators as well, such as Jekyll or nanoc, which I am sure would also work fine.

But why not try it in Symphony CMS? Why not indeed. Nick Dunn generously provided the hints to get me started, and I was able to cobble together a (quite clumsy) PHP script that parses markdown files placed in a web folder, and presents the same in an XML file that can then be used as a Dynamic XML Data Source for Symphony CMS pages. I am not a coder, obviously, but this is the script:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="EN">
    <head>
        <title>PHP Script to Parse Markdown Text Files into XML</title>
    </head>
        <content>
            <?php
            // source markdown library
            include_once('markdown.php');
            // reference the directory
            $handle = opendir('.'); 
            // loop through files with certain extensions
            while (false !== ($file = readdir($handle))){ 
              $extension = strtolower(substr(strrchr($file, '.'), 1)); 
              if($extension == 'txt' || $extension == 'md' || $extension == 'markdown'){ 
                // grab and parse contents putting between node tags 
                $text = file_get_contents($file);
                echo('<node>' . Markdown($text) . '</node>');
              } 
            }
            ?>
        </content>
</html>

Anyone trying this will need to get the markdown.php library. For better or worse, I just chucked that, the above PHP script, and a few markdown files into the root of an exposed site folder. When I run the PHP file, it produces visible results in the browser, and the same output shows up in the ?debug view of the Symphony CMS page it is attached to. That lets me do a few different things:

  • Expose said site folder via webdav, so I can just hook up to the folder via Finder and save markdown-formatted files there.
  • Connect to the server via sftp with a program such as ExpanDrive which would let me avoid fiddling with webdav.
  • Potentially make the exposed web folder a git repository, so I can work on a local clone, and commit from local to the server, to publish the markdown-formatted files in a location where Symphony CMS can then parse them.

A few challenges and ideas and questions come to mind already:

  1. I want a way to sort by file modify date, so I guess that would have to be extracted from the file system metadata somehow, in PHP during the loop. Is it better to extract and append that as part of the XML so Symphony CMS can handle it? Or is it better to try to sort the whole thing in PHP, so, say, the top entry is the last modified?
  2. If I wanted a clickable link, to display one of the markdown files, I would need a way to dynamically filter just the relevant info. Is it possible to refer to some attribute of the larger XML data source, in XSLT, to do this?
  3. It would be nice to know what formatting is best to do in PHP, versus what is best to do in Symphony XSLT templates.
  4. If I bring this into Symphony CMS as a dynamic XML data source, can I search the content?
  5. There is surely a cooler or cleaner way to write this PHP, to make the base XML output.

Any and all comments are most welcome and appreciated. Perhaps I can somehow improve this so it is useful to people who want to edit content in a text editor.

Sincerely,
Rick

I’m still weighing up local storage options for all my stuff (Yojimbo, Yep!, DevonThink et al) and have yet to find the perfect solution. I am, however, looking to store more than text files.

But this is a neat idea. I’d suggest adding some more meta data to your XML so you can sort and filter. For example:

<?php

// ensure output is sent as XML
header('Content-Type: text/xml');
echo('<?xml version="1.0" encoding="utf-8" ?>');

include_once('markdown.php');

$handle = opendir('.'); 

echo('<files>');

while (false !== ($file = readdir($handle))){ 
    $extension = strtolower(substr(strrchr($file, '.'), 1));
    // only process markdown text files
    if(!in_array($extension, array('txt', 'md', 'markdown'))) continue;
    $text = file_get_contents($file);
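    // escape the file name for use in an attribute; \T keeps a literal "T" (a bare T means timezone abbreviation in date())
    echo('<file filename="' . htmlspecialchars($file) . '" last-modified="' . date('Y-m-d\TH:i:s', filemtime($file)) . '">' . Markdown($text) . '</file>');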
}

echo('</files>');
?>

Attach this script as a Dynamic XML data source and you get the file list including file names and dates. Using XSLT you could choose to display a list of file names only, and sort them by this date. The following assumes you have a Symphony page “Markdown Files” that accepts a URL Parameter named filename:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output encoding="UTF-8" indent="yes" method="html" />

<xsl:param name="filename"/>

<xsl:template match="/">
    <xsl:choose>
        <xsl:when test="$filename=''">
            <ul>
                <xsl:apply-templates select="//file" mode="list">
                    <xsl:sort select="translate(@last-modified,'-T:','')" data-type="number" order="descending"/>
                </xsl:apply-templates>
            </ul>
        </xsl:when>
        <xsl:otherwise>
            <xsl:apply-templates select="//file[@filename=$filename]" mode="render"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

<xsl:template match="file" mode="list">
    <li>
        <a href="/markdown-files/{@filename}/"><xsl:value-of select="@filename"/></a>
    </li>
</xsl:template>

<xsl:template match="file" mode="render">
    <h1><xsl:value-of select="@filename"/></h1>
    <xsl:copy-of select="*"/>
</xsl:template>

</xsl:stylesheet>

It should first list the files if the $filename URL Parameter is not set. The list of files will be sorted newest first, using xsl:sort. The translate() function should remove any non-numeric characters from the date string, leaving a number that can be sorted on. If the URL parameter is set, then the full file is shown.
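
To see what that sort key looks like, here is a small PHP sketch (separate from the data source) that builds the same kind of timestamp and strips the separators the way translate() does. It assumes the literal T is escaped in the date format string, since an unescaped T is PHP's timezone-abbreviation code. The function name sortKey is made up for illustration.

```php
<?php
// Hypothetical helper: mimic translate(@last-modified, '-T:', '') in PHP.
function sortKey($lastModified)
{
    // Remove '-', 'T' and ':' so only digits remain.
    return str_replace(array('-', 'T', ':'), '', $lastModified);
}

// Fixed example timestamp in UTC, exactly 365 days after the Unix epoch.
$stamp = gmdate('Y-m-d\TH:i:s', 86400 * 365);
echo $stamp . ' => ' . sortKey($stamp) . "\n";
```

Because every field is zero-padded, the resulting digit strings compare in the same order as the dates themselves, which is what lets xsl:sort treat them as numbers.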

This is how I’d approach it. Keep the data source as simple as possible by just providing the least amount of data required, but perform the manipulation and rendering of the data using the XSLT layer.
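
If you would still rather do the sorting in PHP, a minimal sketch might collect the matching files first and order them by modification time before echoing anything. The function name listMarkdownFiles and its $dir argument are invented for this example, and the script above does not need them.

```php
<?php
// Hypothetical helper: return markdown file paths in $dir, newest first.
function listMarkdownFiles($dir)
{
    $files = array();
    foreach (scandir($dir) as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        $extension = strtolower(pathinfo($name, PATHINFO_EXTENSION));
        // Keep regular files with one of the accepted extensions.
        if (is_file($path) && in_array($extension, array('txt', 'md', 'markdown'))) {
            $files[$path] = filemtime($path);
        }
    }
    arsort($files); // sort by modification time, newest first, keeping the path keys
    return array_keys($files);
}

foreach (listMarkdownFiles('.') as $path) {
    echo basename($path) . "\n";
}
```

The trade-off is that the order is then baked into the XML, whereas sorting in XSLT lets each page choose its own order.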

Wow, thank you very much, Nick!
I will find some time to experiment with this.
Much appreciated.

Regards,
Rick

I took @nickdunn's script above and combined it with a script that I found on Stack Overflow, that will allow you to recursively search for all Markdown files and then list and convert them into one XML document...

I have it on this GIST here.
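
As a rough idea of the recursive part only, a sketch using PHP's SPL directory iterators could look like the following. This is not the code from the gist, and the function name findMarkdownFiles is hypothetical.

```php
<?php
// Hypothetical helper: collect every markdown file below $root, at any depth.
function findMarkdownFiles($root)
{
    $found = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Each $file is an SplFileInfo object.
        $extension = strtolower(pathinfo($file->getFilename(), PATHINFO_EXTENSION));
        if ($file->isFile() && in_array($extension, array('txt', 'md', 'markdown'))) {
            $found[] = $file->getPathname();
        }
    }
    sort($found); // alphabetical by path, so the output order is predictable
    return $found;
}

print_r(findMarkdownFiles('.'));
```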

Also, I made an attempt at updating the script so it would also work with YAML Front Matter in Markdown files and output the YAML Front Matter under a <meta> element.

I did this with my rudimentary PHP skills, so if folks can make it better, please do. Also, right now the YAML Front Matter output in the XML is hard-coded to a "title" and a "uri". Since the front matter will accept whatever keys you give it, I am trying to figure out how to produce the XML nodes for meta dynamically.

So, for Symphony purposes, if you have a few directories full of markdown files (with YAML Front Matter) that you want to turn into an XML data source, you can do so with this script.
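
For anyone unfamiliar with front matter, here is a simplified sketch of the splitting step. It is not what the script does internally: it only understands flat "key: value" lines rather than full YAML, and the function name splitFrontMatter is invented for the example.

```php
<?php
// Hypothetical helper: split a document into front matter pairs and Markdown body.
// Handles only flat "key: value" lines, not nested or multi-line YAML.
function splitFrontMatter($source)
{
    $meta = array();
    $body = $source;
    // Front matter is the block between a leading '---' line and the next '---' line.
    if (preg_match('/\A---\r?\n(.*?)\r?\n---\r?\n?(.*)\z/s', $source, $m)) {
        $body = $m[2];
        foreach (preg_split('/\r?\n/', $m[1]) as $line) {
            if (strpos($line, ':') === false) {
                continue; // skip blank or malformed lines
            }
            list($key, $value) = explode(':', $line, 2);
            $meta[trim($key)] = trim($value);
        }
    }
    return array($meta, $body);
}

list($meta, $body) = splitFrontMatter("---\ntitle: Hello\nuri: /hello/\n---\n# Heading\n");
print_r($meta);
echo $body;
```

A file without a leading '---' block comes back with an empty meta array and the whole text as the body.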

I have created a script to recursively read through a "content" directory of Markdown files that contain YAML Frontmatter (à la Jekyll or Statamic) and would like to create an XML document from it to use as a dynamic datasource.

PHP HELP NEEDED

I was wondering if you could look at my script, located here, https://github.com/bzerangue/markdownyamlxml/blob/master/index.php#L131-L138

I want to dynamically create XML nodes based upon whatever YAML key:value pairs are available. Is there a way for me to dynamically create the fetch statements, so that they fetch whatever key:value pairs are there for me to use?

Unfortunately, my PHP skills are completely horrible and would love to learn more. If and only if you have time, would you mind looking at that script and giving me some pointers?

Brian, is there an example YAML file in the repo?

EDIT: ignore me! RTFR (Read the bloomin Repo) :)

Maybe Dayle Rees' repo for his static, Laravel-based website document parser may help inspire some coding convention, especially the parseMetadata function: github.com/daylerees/kurenai

@moonoo2 - Thanks for the response. Actually, Blaxus, the author of the PHP class for handling YAML frontmatter in PHP, helped me with how to get it to do what I wanted it to. I've updated my little example repo. This can be used to create XML for a directory full of markdown files with YAML Frontmatter.

Brian, Try using the fetchKeys() function mentioned here: https://github.com/Blaxus/YAML-FrontMatter/blob/master/frontmatter.php#L45

So take your $page variable and grab all the meta keys: $metadata = $page->fetchKeys()

Then loop over the $metadata array to grab the key value pairs.

Hah.. I'm too slow.

I see you made the $data public instead of private so that would work as well, here's what I did without altering the FrontMatter class:

$metadata = $page->fetchKeys();

foreach ($metadata as $key => $value)
{
    # You want to skip the content item right?
    # $key = title
    # $value = $page->fetch('title')
    //var_dump($key);
    $metatitle = $doc->createElement(trim($key));
    $metatitle->appendChild($doc->createTextNode($value));
    $meta->appendChild($metatitle);
}
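
To try the same loop without the FrontMatter class, here is a standalone sketch with a plain array standing in for what fetchKeys() returns. The sample keys are invented. The name clean-up step is my own assumption rather than anything in the repo, added because YAML keys can contain characters that are not allowed in XML element names.

```php
<?php
// Hypothetical stand-in for the key/value pairs returned by the FrontMatter class.
$metadata = array(
    'title'      => 'Hello & welcome',
    'uri'        => '/hello/',
    'page order' => '3', // a key that is not a valid XML element name as-is
);

$doc  = new DOMDocument('1.0', 'utf-8');
$meta = $doc->createElement('meta');
$doc->appendChild($meta);

foreach ($metadata as $key => $value) {
    // Replace anything outside letters, digits, '_' and '-' so the key is a safe element name.
    $name = preg_replace('/[^A-Za-z0-9_-]+/', '-', trim($key));
    // Element names must start with a letter or underscore (simplified check).
    if ($name === '' || !preg_match('/^[A-Za-z_]/', $name)) {
        $name = '_' . $name;
    }
    $element = $doc->createElement($name);
    // Text nodes are escaped on output, so '&' is written as '&amp;'.
    $element->appendChild($doc->createTextNode((string) $value));
    $meta->appendChild($element);
}

echo $doc->saveXML($meta);
```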

@moonoo2 - thanks for the input that helped a lot.
