This is still a rough idea... and may only ever be an idea since I haven't the slightest knowledge of how to code it... still, I'm hopeful that because there's a sort of 'engine' already available, someone can figure out how to plug it in to Symphony.

I've been thinking through the problem of teasers lately. How can we have teasers for entries that are of a configurable length, that are 'smart' in that they don't cut sentences or lose closing XHTML tags, and that don't require lots of redundant data input?

I know it's fairly trivial in XSLT to truncate strings while respecting word boundaries, but respecting sentence boundaries and cleaning up XHTML tags would be much more difficult. Plus, to do this solely on the XSLT end means having to load the full content of an entry as XML before truncating it.

As I poked around the internet for possible solutions, I came across a Drupal extension called Super Teaser. This chunk of PHP code truncates content based on a ranked system of conditions... making it more likely to favor sentence breaks over word breaks, for example, or paragraph breaks over sentence breaks. It also closes any HTML tags that are open at the location of the cut, allows you to set rigid floors or ceilings if need be, and allows manual breaking with the use of an HTML comment, much like WordPress. Best of all, it's GPL. I was thinking we could use this engine and plug it into Symphony somehow.
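To make the "ranked" idea concrete, here is a rough XSLT 1.0 sketch of the simplest version: cut at the last sentence end inside the limit, and only fall back to the last word break if there isn't one. This is not Super Teaser's code or its actual rules. The template names (ranked-cut, last-index), the 80-character limit, and treating every '.' as a sentence end are all assumptions for illustration, and it works on plain text only, so it does nothing about closing tags.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point: truncate the text of the whole input document -->
  <xsl:template match="/">
    <xsl:call-template name="ranked-cut">
      <xsl:with-param name="text" select="normalize-space(.)"/>
      <xsl:with-param name="limit" select="80"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Rank 1: last '.' within the limit. Rank 2: last space within the limit. -->
  <xsl:template name="ranked-cut">
    <xsl:param name="text"/>
    <xsl:param name="limit"/>
    <xsl:variable name="head" select="substring($text, 1, $limit)"/>
    <xsl:variable name="sentence">
      <xsl:call-template name="last-index">
        <xsl:with-param name="text" select="$head"/>
        <xsl:with-param name="char" select="'.'"/>
        <xsl:with-param name="pos" select="string-length($head)"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:variable name="word">
      <xsl:call-template name="last-index">
        <xsl:with-param name="text" select="$head"/>
        <xsl:with-param name="char" select="' '"/>
        <xsl:with-param name="pos" select="string-length($head)"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:choose>
      <!-- Short enough already: no cut needed -->
      <xsl:when test="string-length($text) &lt;= $limit">
        <xsl:value-of select="$text"/>
      </xsl:when>
      <!-- Keep everything up to and including the last full stop -->
      <xsl:when test="$sentence &gt; 0">
        <xsl:value-of select="substring($head, 1, $sentence)"/>
      </xsl:when>
      <!-- No sentence end: cut before the last space and add an ellipsis -->
      <xsl:when test="$word &gt; 0">
        <xsl:value-of select="substring($head, 1, $word - 1)"/>
        <xsl:text>&#8230;</xsl:text>
      </xsl:when>
      <!-- One very long word: hard cut at the limit -->
      <xsl:otherwise>
        <xsl:value-of select="$head"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Walks backwards from $pos; returns the position of the last $char, or 0 -->
  <xsl:template name="last-index">
    <xsl:param name="text"/>
    <xsl:param name="char"/>
    <xsl:param name="pos"/>
    <xsl:choose>
      <xsl:when test="$pos &lt; 1">0</xsl:when>
      <xsl:when test="substring($text, $pos, 1) = $char">
        <xsl:value-of select="$pos"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="last-index">
          <xsl:with-param name="text" select="$text"/>
          <xsl:with-param name="char" select="$char"/>
          <xsl:with-param name="pos" select="$pos - 1"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

A real implementation would need a smarter sentence test (abbreviations and decimals both contain full stops) and probably a floor, so that a very early full stop doesn't produce a ten-character teaser.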

I'm not sure what the best implementation would be. Right now, I'm imagining a custom textarea field that auto-generates teasers for itself. I'm not sure if it should database these (which would mean figuring out how to deal with changing the settings after teasers have already been generated)... or if it should generate them on-the-fly when requested—the idea here being that, in the list of fields to include in your DS output, for example, you might see:

  • Title
  • Body
  • Body (teaser)
  • etc

and the script runs when the DS is requested (I guess that would be a performance issue?). Anyway, I figured I'd get this written down and let people smarter than me decide what would be the best way to do it.

Let's keep [extension] for posts that contain finished extensions - Lewis

Well, I normally make a second field called "kicker" in the backend and put the teaser in there. However, this is yet another field that has to be filled in. I forget where I was perusing, but that person's teaser was the entire first paragraph of their entry (not uncommon, but this time it finally clicked). After that, I just wrote a little XSLT that calls the body area of my XML and limits it to the first paragraph.

That way you don't have any broken content or unwanted, unwrapped elements. For a simple teaser with only text, it's pretty simple to implement too:

<xsl:copy-of select="body/*[1]" />

I also did something a little more complex with a thumbnail picture, but this is the default action if there isn't one. Only caveat here is that you still have to load the entire post into the xml.

Right... that might be a good simple way to do it. I'll have to think about it. The problem is that I'm dealing with articles from a scholarly journal and you can't even imagine how long-winded some of these opening paragraphs can be...

I use more or less the same as wtdtan:

<xsl:apply-templates select="body/*[1]"/>

...

<xsl:template match="body//*">
  <xsl:element name="{name()}">
    <xsl:apply-templates select="* | @* | text()"/>
  </xsl:element>
</xsl:template>

<xsl:template match="body//@*">
  <xsl:attribute name="{name(.)}">
    <xsl:value-of select="."/>
  </xsl:attribute>
</xsl:template>
...

plus shortening the paragraph:

...
<xsl:template match="p/text()">
  <xsl:value-of select="substring(.,1,111)" />
  <xsl:if test="string-length(.) &gt; 111">
    <xsl:text>&#8230;</xsl:text>
  </xsl:if>
  <a href="{$root}/{/data/its-datasource/entry/title/@handle}/" title="{title}" rel="chapter"> More » </a>
</xsl:template>

But there's one problem I have with substring(). For example, if you have an encoded email link or something like that inside your first paragraph:

<p>Sunny sun shines, bla bla... <script type="text/javascript" src="/workspace/js/script.js"></script>...bla, bla... bla.</p>

In that case substring() counts up to the script element, starts counting again from zero after ...js">, and "More »" is inserted twice: before the encoded link and at the teaser's end.

I guess, or at least I hope, this could be solved with XML/XSLT, too. But I didn't figure it out. Perhaps someone else can... ^_^

Czheng,

For your problem, isn't that pretty much exactly what abstracts are for? I know I've seen abstracts used for teasers elsewhere.

FWIW, I've been creating a discrete 'teaser' field, but I'm not intending it to be an actual preview of the same content, it's more custom-written for promoting click-thru.

I'm just trying to get us to think through what would be a smarter implementation in general (not necessarily in my case): i.e. one that doesn't require redundant data input, one that can be adjusted site-wide fairly easily, etc. I thought something like this would be a good addition to the Symphony developer's arsenal, so to speak.

And even though I didn't really start this thread to gather immediate solutions for myself, it's been very very helpful to see what others are doing, so thanks for that.

In fact, it strikes me that it could be a common thread type. Imagine:

[Best Practices] Entry Teasers or [Implementations] Entry Teasers

Those threads could then be poured into the Overture wiki once it's ready.

That's a good idea. Anything to centralize resources will be very good once it's time for the wiki. I have the feeling that it will benefit from more conceptual descriptions of implementations in addition to step-by-step how-tos.

But there's one problem I have with substring().

It isn't so much substring() but the fact that your template matches each text node. You can eliminate the duplication by matching only the first text node.
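Roughly what I mean, as a minimal sketch rather than a drop-in fix. The entry point (//body/*[1]) and the rule that drops later text nodes are assumptions for the example; the "More »" link is left out to keep it short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical entry point: build the teaser from the first child of body -->
  <xsl:template match="/">
    <xsl:apply-templates select="//body/*[1]"/>
  </xsl:template>

  <!-- Copy elements and their attributes through unchanged -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Only the first text node of a paragraph is shortened and gets the ellipsis -->
  <xsl:template match="p/text()[1]">
    <xsl:value-of select="substring(., 1, 111)"/>
    <xsl:if test="string-length(.) &gt; 111">
      <xsl:text>&#8230;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Later text nodes are dropped (assumption; delete this rule to keep them) -->
  <xsl:template match="p/text()[position() &gt; 1]"/>
</xsl:stylesheet>
```

Because the ellipsis now lives in a template that fires once per paragraph, anything appended there appears once. The trade-off is that text after an inline element is lost or left unshortened, depending on whether the last rule is kept.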

If I have time, I'll play around with this and implement a teaser extension as a field type.

It isn't so much substring() but the fact that your template matches each text node. You can eliminate the duplication by matching only the first text node.

Yep, but then the second text node and the encoded link (which is ignored in the example above anyway) are sent to Coventry. (i.e. these three thingies add up to a one-sentence paragraph)

To get this solved, it should treat the first and the second text node as if they were one text node, and insert the script part between these two nodes again later. It's a bit tricky and long-winded, for me at least.

Not sure how or if that can be solved (easily) anyway. If it can, I guess it would be with with-param, but I'm no expert in that. Currently I bypass it with a manually edited choose part. This is definitely not the holy grail.

Hmm, I'll play around with some template recursion later.
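For the record, one shape that recursion could take is "sibling recursion": walk the paragraph's child nodes one at a time and hand the remaining character budget to the next sibling via with-param. This is only a sketch under assumptions: the mode names (teaser, budget), the entry point (//body/p[1]) and the choice not to count inline elements against the budget are all mine, and it cuts mid-word.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical entry point: teaser from the first paragraph of body -->
  <xsl:template match="/">
    <xsl:apply-templates select="//body/p[1]" mode="teaser"/>
  </xsl:template>

  <xsl:template match="p" mode="teaser">
    <p>
      <!-- Start with the first child only; each node hands on to its next sibling -->
      <xsl:apply-templates select="node()[1]" mode="budget">
        <xsl:with-param name="remaining" select="111"/>
      </xsl:apply-templates>
    </p>
  </xsl:template>

  <!-- Text node: emit it whole if it fits, otherwise cut it and stop the chain -->
  <xsl:template match="text()" mode="budget">
    <xsl:param name="remaining"/>
    <xsl:choose>
      <xsl:when test="string-length(.) &lt;= $remaining">
        <xsl:value-of select="."/>
        <xsl:apply-templates select="following-sibling::node()[1]" mode="budget">
          <xsl:with-param name="remaining" select="$remaining - string-length(.)"/>
        </xsl:apply-templates>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="substring(., 1, $remaining)"/>
        <xsl:text>&#8230;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Inline element (e.g. script): copied whole, not counted against the budget -->
  <xsl:template match="*" mode="budget">
    <xsl:param name="remaining"/>
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::node()[1]" mode="budget">
      <xsl:with-param name="remaining" select="$remaining"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Comments and processing instructions: skipped, but the chain continues -->
  <xsl:template match="comment() | processing-instruction()" mode="budget">
    <xsl:param name="remaining"/>
    <xsl:apply-templates select="following-sibling::node()[1]" mode="budget">
      <xsl:with-param name="remaining" select="$remaining"/>
    </xsl:apply-templates>
  </xsl:template>
</xsl:stylesheet>
```

Since the count carries across text nodes, the text before and after the script is treated as one run, and a "More »" link can be added once inside the p template instead of per text node.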

czheng, I was speaking with a coworker the other day (not specifically about this thread, but a related topic) and he said what you're trying to achieve is called "smart chomping". Now, I don't know how intensive this would be on the XSLT processor, but I think he was telling me you could make a template (well, we were discussing FreeMarker, where there is something called a macro) and pass in a substring length and a string for what you want the substring to end in (i.e. ' '). so it will see if the last character is space, otherwise it goes backwards until it matches the space. (Sorry, XSLT, I've been having an affair with FreeMarker, the templating language for Java * sob *)

Thanks wtdtan, and don't worry, XSLT gets around too ;)

pass in a substring length and a string for what you want the substring to end in (i.e. ' '). so it will see if the last character is space, otherwise it goes backwards until it matches the space.

Yup, that's what I've done so far to be able to limit by n characters without breaking a word. I'm trying to take the template a little further and include child element markup and not just text. Still working...

My attempt so far. It has some problems and does not include child markup. Still working...

<?xml version="1.0" encoding="UTF-8"?>

<s:stylesheet version="1.0" xmlns:s="http://www.w3.org/1999/XSL/Transform">
    <s:output method="xml"
        doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
        doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
        omit-xml-declaration="yes"
        encoding="UTF-8"
        indent="yes" />

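    <s:template match="data">
        <!-- Select the children of body directly: body itself would be handled by
             the built-in rule, which does not pass parameters on in XSLT 1.0 -->
        <s:apply-templates select="body/*" mode="teaser">
            <s:with-param name="total-width">100</s:with-param>
        </s:apply-templates>
    </s:template>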

    <s:template match="body//*" mode="teaser">
        <s:param name="text" select="normalize-space(.)" />
        <s:param name="total-width" /> 

        <s:element name="{name()}">
            <s:if test="$text">
                <s:variable name="best-width">
                    <s:call-template name="choose-width">
                        <s:with-param select="$text" name="text" /> 
                        <s:with-param select="$total-width" name="width" /> 
                    </s:call-template>
                </s:variable>
                <s:value-of select="substring($text, 1, $best-width)" /> 
            </s:if>
        </s:element>
    </s:template>

    <s:template match="body//@*" mode="teaser">
        <s:attribute name="{name(.)}">
            <s:value-of select="."/>
        </s:attribute>
    </s:template>

    <s:template name="choose-width">
        <s:param name="text" /> 
        <s:param name="width" /> 

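        <s:choose>
            <!-- Text already fits within the width: keep all of it -->
            <s:when test="string-length($text) &lt;= $width">
                <s:value-of select="string-length($text)" /> 
            </s:when>
            <!-- No space found before the limit: fall back to the full text -->
            <s:when test="$width = 0">
                <s:value-of select="string-length($text)" /> 
            </s:when>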
            <s:when test="substring($text, $width, 1 ) = ' '">
                <s:value-of select="$width" /> 
            </s:when>
            <s:otherwise>
                <s:call-template name="choose-width">
                    <s:with-param select="$text" name="text" /> 
                    <s:with-param select="$width - 1" name="width" /> 
                </s:call-template>
            </s:otherwise>
        </s:choose>
    </s:template>

</s:stylesheet>

@lewis, why does your stylesheet have <s:whatever>? is this just a typo?

You don't have to use xsl as the namespace ;)

My attempt so far. It has some problems and does not include child markup. Still working...

Holy crap! You're Superman! :D

You don't have to use xsl as the namespace ;)

how does this work?

Also, I was digging around the General class (class.general.php) and saw this function, limitWords, which says the following for its usage:

Method: limitWords
Description: truncates a string so that it contains no more than a certain
             number of characters, preserving whole words
Param: $string - string to operate on
       $maxChars - maximum number of characters
       $appendHellip (optional) - can optionally append a hellip entity
                                  to the string if it is smaller than
                                  the input string
Return: resultant string

Could this help with an extension (if one is created for this functionality)?

wtd,

About the namespaces: you can declare the namespace prefix to be anything, not just 'xsl'. Very handy for shortening things up a bit.
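A tiny example, in case it helps. The prefix "s" and the output here are arbitrary picks for illustration; what makes the elements XSLT instructions is the namespace URI the prefix is bound to.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- "s" is just a label; the URI below is what identifies these elements as XSLT -->
<s:stylesheet version="1.0" xmlns:s="http://www.w3.org/1999/XSL/Transform">
  <s:output method="xml" omit-xml-declaration="yes"/>

  <s:template match="/">
    <p><s:value-of select="'Hello'"/></p>
  </s:template>
</s:stylesheet>
```

Swap every s: for xsl: (and xmlns:s for xmlns:xsl) and the processor treats the stylesheet exactly the same.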
