
I've got a question about the sym_cache table.
What exactly is stored in this table?
And... is there a way to disable this cache function?
My table contains over 150,000 rows and is consuming too much server space.

Hmm, strange. I don't think it usually stores that much data. Do you maybe have a lot of RSS feeds (dynamic data sources)? That's the first thing off the top of my head that gets stored in sym_cache, and probably JIT image resizing too, if you are using that. If I'm not mistaken that also caches the images (not sure if to file or to the db, though).

If your cache is built from these two, disabling the cache wouldn't make a lot of sense, as you'd have to recreate the image / dynamic data source result every time.

Hi gunglien,
The site calls a lot of RSS feeds, so I think that's where the problem is.
I don't use JIT, because that would mean resizing and caching hundreds of photos a day.
The site gets an average of 1000 pageviews per day, all created by dynamic data sources (RSS). So you think disabling the cache would not make a lot of difference?

If you switch off your cache table, what will happen is that your RSS feed will never be cached; i.e. every time it is fetched from the source, which (on my site) takes around 1s. So imagine that every request from every user has to wait around 1s, depending on the speed of the RSS source.

So I would rather enable the cache than not, as it gives better performance.

However, as a side question: do the URLs of these feeds change in any way? If so, you would be creating unused cache entries every time they change (e.g. if they have any time parameters in them).
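To see why a changing URL inflates the table, here is a minimal Python sketch of the idea, assuming (as discussed later in this thread) that cache entries are keyed on a hash of the full request URL; the URL and the `t` parameter are made up for illustration:

```python
import hashlib

def cache_key(url: str) -> str:
    # Hypothetical sketch: the cache is keyed on a hash of the full URL,
    # so any change in the query string produces a brand-new entry.
    return hashlib.md5(url.encode()).hexdigest()

stable = cache_key("http://example.blogspot.com/feeds/posts/default")
same   = cache_key("http://example.blogspot.com/feeds/posts/default")
varied = cache_key("http://example.blogspot.com/feeds/posts/default?t=1325376000")

print(stable == same)    # identical URL -> identical key, the row is reused
print(stable == varied)  # a time parameter -> new key, and a new row every request
```

With a time parameter in the URL, every fetch writes a fresh row instead of replacing the old one, which is exactly how a table balloons to 150,000 entries.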

And when you say a lot of RSS feeds, are you referring to something like 10, 100, or more?

The cache table is just used for caching Dynamic XML Data Sources, nothing else to my knowledge. It sounds as though:

  • the Dynamic XML data sources you have are requesting a different URL for every user
  • there may be a problem in flushing out old entries from this table when they expire

What is your cache time on your data sources?

The cache limit on my data source is 5 minutes... (I think I understand the number of rows in sym_cache now.)

I think changing this to a week or a month would make a lot of difference.

there may be a problem in flushing out old entries from this table when they expire

This must be the case, given the 150,000 entries in the sym_cache table.

So increasing the cache limit and flushing old entries would pretty much solve the problem, I guess. I can flush manually, but it would be great if this could be implemented or fixed somehow.

Cheers

What version of Symphony are you running? Can you give an example of a URL that would be cached?

Having a 5-minute limit shouldn't impact the size, as far as I know, since the hash should be identical for the same URL, and it should replace the previous value.

This wouldn't hold if you are passing a date/time parameter that changes every time, or a per-user value, as brendo said above.

Flushing would be pretty easy. I think I can give you a minor extension (a back-end page that would let you flush manually and also show the table size) or something to randomize the cache clearing.
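For a manual flush in the meantime, something like the following SQL should do it; this assumes the table has an `expiry` column holding a Unix timestamp (check your schema first, since the exact columns may differ by Symphony version):

```sql
-- Drop only the entries that have already expired
DELETE FROM sym_cache WHERE expiry < UNIX_TIMESTAMP();

-- Or, to empty the table entirely and reclaim the space:
TRUNCATE TABLE sym_cache;
```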

Another question: when you flush, how long does it take to get back to 150k entries? A couple of hours, or days?

I am running Symphony 2.2.4. I can't really show a live example right now because I'd risk my account being suspended.

The site is just a personal experiment, so it's not very important. It's a template that gets dynamically populated by a Blogspot feed, based on a URL parameter that corresponds to the Blogspot name. There are about a hundred Blogspot sites indexed by Google that are mirrored in the template.
So that comes to about 1000 unique visitors per day and about 1500 pageviews.

Another question: when you flush, how long does it take to get back to 150k entries? A couple of hours, or days?

I just watched the sym_cache table and it creates a new entry every second or so... (this is strange, because that would mean a lot more pageviews...)

I don't mind if the page takes a little longer to load because the DS is not cached.
I think in this case it's not really beneficial to cache the data sources, because they change a lot and are different most of the time. So I would like to see what happens when data source caching is somehow disabled.

Ah, so you have a different feed per Blogspot user. And every person, I presume, sees a different Blogspot feed?

Before you turn off the cache completely (I will see if I can find out how that is done): how often do you think a user visits/views the same feed? If each feed is viewed only about once per 5-minute window (your cache period), then you are not gaining any performance at all from caching the entry, so you might as well delete it.

However, if you have a couple of feeds that are highly visited while others have low views, you might want to opt for a more complex solution: a small custom extension that keeps track of the most visited feeds and, depending on the frequency values, decides whether or not to cache.

Maybe it starts with no cache, but if a feed gets 50+ visits/day, you cache it for 30 minutes; if fewer than 50, do not cache at all. This would reduce your table size without compromising performance.
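The policy above is simple enough to sketch in a few lines of Python; the threshold and TTL values are the hypothetical numbers from this post, not anything Symphony ships with:

```python
def cache_ttl_minutes(views_per_day: int, threshold: int = 50) -> int:
    """Hypothetical policy from the discussion above: feeds at or above the
    daily-view threshold get a 30-minute cache; everything else is fetched
    fresh every time (a TTL of 0, i.e. no cache entry at all)."""
    return 30 if views_per_day >= threshold else 0

print(cache_ttl_minutes(120))  # popular feed -> 30
print(cache_ttl_minutes(2))    # rarely viewed feed -> 0
```

A custom extension would call something like this per feed before deciding whether to write a cache row.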

So you have a single data source that's being passed a parameter with around 100 different values?

This should mean 100 entries in the cache table that expire in 'x' minutes, where 'x' is what you have set on your data source. Odd that this is not happening.

Ah, so you have a different feed per Blogspot user. And every person, I presume, sees a different Blogspot feed?

That's correct

So you have a single data source that's being passed a parameter with around 100 different values?

Yes, the data source takes multiple variables, generating something like 1000 different pages (data source results) a day.

There are a couple of popular feeds with multiple pageviews, 25 views a day at the highest, and that's about 5%.
90% of the feeds get viewed 1 or 2 times a day.
This would mean that it's not really worth caching in this case, I guess.

Exactly, most of the time it's not worth caching. Also, if the most you get is 25 pageviews a day, caching is pretty useless: if the entry is cleared every 5 minutes and the views are more than 5 minutes apart, you are not getting any performance from the cache.

It could, however, slow down your system: when an item is expired, if I am not mistaken, the system will try to run an OPTIMIZE on the table. So if you have performance problems, maybe you are running that a bit too often. I will have a look and see what modifications to the data source you can do :)

I won't be able to do it today, most probably, as I had all my feeds customized and will need to set up a test on my local machine when I have some free time. Alternatively, what you can do is add some randomization: on a random factor, call TRUNCATE sym_cache (so you keep the size under control for now) and include it in the data source grab.
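The randomized-truncate idea could look something like this; it's a language-neutral sketch in Python, where `execute_sql` stands in for whatever database handle the data source actually has (the probability is an arbitrary example value):

```python
import random

def maybe_flush_cache(execute_sql, probability: float = 0.01) -> bool:
    """On roughly 1 request in 100 (by default), empty the cache table.
    `execute_sql` is a hypothetical callable that runs a SQL statement."""
    if random.random() < probability:
        execute_sql("TRUNCATE TABLE sym_cache")
        return True
    return False
```

Dropped into the data source's fetch path, this keeps the table size bounded without the cost of checking expiry on every pageview.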

It could, however, slow down your system: when an item is expired, if I am not mistaken, the system will try to run an OPTIMIZE on the table.

I guess this would be very severe on the server, with 150,000 rows being checked on every pageview, so if I'm right it performed this task over 1000 times a day.
So I guess getting rid of the cache function is the answer in this case.

In the meantime I will leave it offline so Google will take the search results out of its index, since the pages don't exist anymore (I don't know how long this will take).

They won't get reindexed until I list the URLs to the pages somewhere (that's what I did for about a hundred popular Blogspot sites :)

I will have a look and see what modifications to the data source you can do :)

That would be cool, but no hurry!

Thanks

In the meantime I will leave it offline so Google will take the search results out of its index, since the pages don't exist anymore (I don't know how long this will take).

It might really take a while. If you want them removed, you have to use a robots.txt; otherwise they will just show up as 404s and Google will keep trying (you can easily see this if you have a Google Webmaster account).
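For reference, the minimal robots.txt that blocks crawling of the whole site is just two lines, placed at the web root (e.g. /robots.txt):

```
User-agent: *
Disallow: /
```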

Well, with 150k rows I don't think it's an issue. However, once you get over 700k, maybe. I had one of my systems crash at that level, though presumably that was due to db corruption. While the OPTIMIZE runs, it blocks all other access to the table, which could lead to other problems.

Good one, I just blocked the site with a robots.txt file.

The site is on a shared host, so it was affecting other customers.

The cache table is just used for caching Dynamic XML Data Sources, nothing else to my knowledge.

Not entirely true: there are some extensions that also make use of the caching table. However, reading the rest of the thread, it was probably not caused by an extension; I just wanted to point it out ;-)


Symphony • Open Source XSLT CMS
