Advanced search

Announcements about changes to the forums will be posted here. Also for suggestions and requests for technical assistance, etc.
terkio
Mon Master
Posts: 937
Joined: Tue Jul 10, 2012 8:24 pm

Advanced search

Post by terkio »

I do not understand.
I searched for "cave story" with "Search for all terms or use query as entered" selected,
expecting posts that contain both "cave" AND "story".
Instead I get tons of posts (15 pages), and most of them only contain "cave". Such a search result is close to useless!
Am I doing something wrong? What is the right query?
"You can be on the right track and still get hit by a train!" Alfred E. Neuman
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

I believe you need to use +cave +story
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

Hmm, the search does seem borked. It seems to be doing an 'or' search, despite all the comments to the contrary.
terkio
Mon Master
Posts: 937
Joined: Tue Jul 10, 2012 8:24 pm

Re: Advanced search

Post by terkio »

Thanks,
Indeed, I had tried +cave +story and saw that it does an OR too.

I hope the advanced search will be fixed. It is frustrating when you search on words that are likely to appear in all sorts of posts.
"You can be on the right track and still get hit by a train!" Alfred E. Neuman
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

It's a basic function of phpbb, so sadly it would only get fixed when phpbb fixes it and we update the forum software to that version. Anyone else know of this issue/know if it's being addressed?
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

phpBB (the forum software) has a list of common words that are ignored in a search. This is to prevent overloading the database with searches for things like "and, of, then, it, as" etc.

If you search for "story" on its own, it's actually listed as a common word and is ignored. This is why your search doesn't work.

Common words are determined by the forum software automatically based on the contents of the entire forum. So because we've clearly used the word "story" thousands of times over the 12 years this forum's been alive, the software considers it common.

The only thing I can do is rebuild the search index and see if that helps. This can take hours and it may not even work. Fingers crossed...

EDIT: There is a bug tracker for this issue with some manual solutions (editing the database directly) -- I'll look at that as a last resort if the index rebuild doesn't work. The bug is closed as "Won't fix" so it's not something that will be improved.
http://tracker.phpbb.com/browse/PHPBB3-8175
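
To illustrate why an ignored common word can make an "all terms" search look like an OR search, here is a minimal Python sketch. It is a toy model and not phpBB's actual code: the functions `build_index` and `search_all_terms` and the sample posts are made up for illustration.

```python
def build_index(posts):
    """Map each lowercase word to the set of post ids that contain it."""
    index = {}
    for post_id, text in posts.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(post_id)
    return index


def search_all_terms(index, query, common_words):
    """AND search that silently drops any term flagged as common."""
    terms = [t for t in query.lower().split() if t not in common_words]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample posts, only for this illustration.
posts = {
    1: "cave story is a great game",
    2: "the cave level in this dungeon is hard",
    3: "a long story about my party",
}
index = build_index(posts)

# With no common words, only the post containing both terms matches.
strict = search_all_terms(index, "cave story", common_words=set())

# With "story" flagged as common, the query degrades to just "cave".
loose = search_all_terms(index, "cave story", common_words={"story"})
```

In this toy model the second search returns every post that mentions "cave", which matches the symptom described at the top of the thread.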
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

UPDATE: There's actually a simple table in the database that lists all the indexed words and whether or not they are classed as common. I found "story" in there set as common, so I unset it and have now started to re-index the forum. Once the re-index finishes, the search for "cave story" should work (I think!)
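
A rough sketch of that kind of manual fix, using an in-memory SQLite database in Python. The table name `wordlist` and the columns `word_text` and `word_common` are assumptions for illustration and may not match the real phpBB schema; the helper `unset_common` is hypothetical too.

```python
import sqlite3

# In-memory stand-in for the forum database (assumed schema, not phpBB's).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wordlist (word_text TEXT PRIMARY KEY, word_common INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO wordlist VALUES (?, ?)",
    [("cave", 0), ("story", 1), ("the", 1)],
)


def unset_common(conn, word):
    """Clear the 'common' flag for one word; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE wordlist SET word_common = 0 WHERE word_text = ?", (word,)
    )
    conn.commit()
    return cur.rowcount


changed = unset_common(conn, "story")
flags = dict(conn.execute("SELECT word_text, word_common FROM wordlist"))
```

On a real forum the search index would still need to be rebuilt afterwards, as described above.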
terkio
Mon Master
Posts: 937
Joined: Tue Jul 10, 2012 8:24 pm

Re: Advanced search

Post by terkio »

Thanks,
I understand the phpBB guys' reasoning about common words, but what can I do to see whether the game cave_story has already been discussed on the forum?
A search request for cave_story is changed into cave story.
What can one do with common words like cave and story? Obviously I cannot use other words and rename the game I am talking about.
The best I can do is ask: give me posts or topics that contain both of the common words cave and story.
I am afraid the medicine from the phpBB guys is killing the patient.
"You can be on the right track and still get hit by a train!" Alfred E. Neuman
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

Wait and see if the changes I made actually work first ;-) Then we can address any other issues.

The index re-creation is still running and may take another 30 mins or so.
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

OK, index is rebuilt and a search for "cave story" and "cave_story" now works.

At some point I'll probably have to go through the entire common words list and remove anything else that shouldn't be classed as common. Doubt I'll ever get the time though..... ;-)
Lord_BoNes
Jack of all trades
Posts: 1064
Joined: Mon Dec 01, 2008 12:36 pm
Location: Ararat, Australia.

Re: Advanced search

Post by Lord_BoNes »

Glad the search is fixed. Sounds like a fair job you've got ahead of you there Gambit... wish you the best of luck :)
 

1 death is a tragedy,
10,000,000 deaths is a statistic.
- Joseph Stalin

Check out my Return to Chaos dungeon launcher
And my Dungeon Master Clone
terkio
Mon Master
Posts: 937
Joined: Tue Jul 10, 2012 8:24 pm

Re: Advanced search

Post by terkio »

Thanks, I hope this fix will be good for everybody, not just for my personal search about cave story.
Sorry about the time you spent fixing it.
Wouldn't it be simpler to completely remove the common word list feature invented by the phpBB guys?
I think it cannot work, because there are, and will be, legitimate searches that cannot avoid common words.
Am I missing something?
"You can be on the right track and still get hit by a train!" Alfred E. Neuman
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

Interesting. Can someone cripple the site for a while running a search for 'and' or 'the'?
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

phpBB cross references words against posts where they are used, and builds an index list of all the words on the site. It uses some algorithm to calculate which words are "common" based on frequency of a word's use. So if you use a word many times, it becomes "common" as far as phpBB is concerned, even if we wouldn't typically consider the word "common". At least, that's how I understand it.

In theory it's possible to overload a site with regular searches for common words, and this is why the feature was built to limit that overloading problem. I think some words are automatically added as common before the index is even built, plus you can't search for words of 3 letters or less. So there are plenty of things built in to prevent overloading and to remain optimised, but some of these measures could be seen as "breaking" search -- 'cos you can't search for all words.

I'm not really concerned about it on this forum as we have such a small user base it's rarely a problem. If we were running a huge complex forum that contained tons of TLAs (Three Letter Acronyms), then yeah we could have a "broken" search that wouldn't find those TLAs, but for now I'm not even worrying about it :-P
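
As a rough illustration of frequency-based marking, here is a small Python sketch. It only reflects my understanding as described above and is not phpBB's real algorithm: the function `find_common_words`, the `threshold` parameter and the sample posts are all assumptions.

```python
from collections import Counter


def find_common_words(posts, threshold=0.5):
    """Return words that appear in more than `threshold` (a fraction) of all posts."""
    doc_freq = Counter()
    for text in posts:
        # Count each word once per post, however often it repeats inside it.
        doc_freq.update(set(text.lower().split()))
    limit = threshold * len(posts)
    return {word for word, count in doc_freq.items() if count > limit}


# Hypothetical sample posts, only for this illustration.
posts = [
    "dungeon master",
    "dungeon crawl",
    "dungeon story",
    "cave story",
]
common = find_common_words(posts, threshold=0.5)
```

Here "dungeon" appears in three of four posts and gets flagged, while "story" appears in exactly half and stays searchable. A word becomes "common" purely because of how often this particular forum uses it.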
Lord_BoNes
Jack of all trades
Posts: 1064
Joined: Mon Dec 01, 2008 12:36 pm
Location: Ararat, Australia.

Re: Advanced search

Post by Lord_BoNes »

So that means that both RTC and DSB can't be searched for? Bummer.

I can see 1 major weakness to the "common word" approach... if someone were to bombard the forum with posts containing nothing but "break break break" etc... then the "common word" tactic would block the word "break" from being searched for. But, I'd imagine the posts containing such text would quickly be stomped by admins.
 

1 death is a tragedy,
10,000,000 deaths is a statistic.
- Joseph Stalin

Check out my Return to Chaos dungeon launcher
And my Dungeon Master Clone
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

Actually, DSB and RTC come up just fine -- perhaps it's only 2 letter words that by default are ignored.

Plus I think my explanation is wrong: if it were correct, then Dungeon and Master would not show up in results, but they do.
Lord_BoNes
Jack of all trades
Posts: 1064
Joined: Mon Dec 01, 2008 12:36 pm
Location: Ararat, Australia.

Re: Advanced search

Post by Lord_BoNes »

Fair point! Something tells me people would notice...
 

1 death is a tragedy,
10,000,000 deaths is a statistic.
- Joseph Stalin

Check out my Return to Chaos dungeon launcher
And my Dungeon Master Clone
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

A brief look into it suggests that, as Gambit said, the database should compose the common word list from our posts. I guess we use 'DM' far more than dungeon or master :D

And the 3 / 2 letter limit is a setting in the control panel. We can lower it more if we want, but it's currently set to 3 letter words.

I notice that we can disable it fully by telling it to make common 0% of the words and then disabling the common word search. However, I'm loath to do that given this is the first time in many years we've encountered an issue. I'll leave it up to the more tech savvy admins to figure out the downsides of it all.
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

I just read all the comments on the bug report linked above; the last one explains things a bit better:

"Common words only start to get marked after 100 posts."
http://tracker.phpbb.com/browse/PHPBB3- ... ment-29251

And this comment above it gives some more info on how these common word errors can creep in
http://tracker.phpbb.com/browse/PHPBB3- ... ment-29250
Lord_BoNes
Jack of all trades
Posts: 1064
Joined: Mon Dec 01, 2008 12:36 pm
Location: Ararat, Australia.

Re: Advanced search

Post by Lord_BoNes »

1st hiccup in numerous years = "pretty damn good" in my opinion :P
 

1 death is a tragedy,
10,000,000 deaths is a statistic.
- Joseph Stalin

Check out my Return to Chaos dungeon launcher
And my Dungeon Master Clone
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

Ah, I thought it meant the forum had 100 posts, not that the word had occurred in 100 posts! Interesting...
terkio
Mon Master
Posts: 937
Joined: Tue Jul 10, 2012 8:24 pm

Re: Advanced search

Post by terkio »

I see I opened a can of worms. I am sorry.

I had a look at a forum that has a lot of users online, to see what it does against search overloading:
DIYaudio, a site for audio and electronics geeks http://www.diyaudio.com/forums/
I did a search for "the"; it was rejected because "the" is too common a word. So far so good.
I did a search for "amp"; it was accepted, returning nearly 100,000 posts, and the search took 40 seconds. "amp" is definitely a very common word, yet the search doesn't consider it a common word. :shock:
I made no more test searches; I do not want to hog their forum.

I don't know whether this helps or brings more confusion. I am lost :oops:
"You can be on the right track and still get hit by a train!" Alfred E. Neuman
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

Don't worry about it :-)
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

Actually, this is weird:

If you search for "dungeon master" as a phrase, you get lots of results returned and both words are highlighted in the matched results.
If you search for "master", you get a similar result.
But if you search for "dungeon" on its own, you get the "no results, word too common" response.

....!?!?!?
beowuuf
Archmastiff
Posts: 20687
Joined: Sat Sep 16, 2000 2:00 pm
Location: Basingstoke, UK

Re: Advanced search

Post by beowuuf »

Well, think about where most of the custom games are set....indeed what we call them... and think about the fact that I also ran a D&D game set in the same sort of environment....

I think 'dungeon' gets thrown around far, far more as a word than master does :D
Gambit37
Should eat more pies
Posts: 13714
Joined: Wed May 31, 2000 1:57 pm
Location: Location, Location

Re: Advanced search

Post by Gambit37 »

What I meant was it's weird that "dungeon master" as a phrase is matched just fine, but "dungeon" isn't. Especially considering the matches for "dungeon master" are the same as for "master".

Doesn't make any sense. phpBB is weird! :D
terkio
Mon Master
Posts: 937
Joined: Tue Jul 10, 2012 8:24 pm

Re: Advanced search

Post by terkio »

Sure, a real time shared system must be protected against resource hogging.

I do not understand why they invent weird schemes when it would be so simple to use quotas:
just set a maximum number of results per request (and a minimum delay between requests from a user).
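
A minimal Python sketch of what I mean by quotas. The class `SearchQuota` and its parameters `max_results` and `min_delay` are names I made up for illustration; this is not how phpBB or any other forum software actually implements it.

```python
class SearchQuota:
    """Caps the number of results and enforces a per-user delay between searches."""

    def __init__(self, max_results=200, min_delay=10.0):
        self.max_results = max_results
        self.min_delay = min_delay   # seconds required between searches per user
        self._last_request = {}      # user -> time of the last accepted search

    def run(self, user, matches, now):
        """Return at most max_results matches, or raise if the user searches too soon."""
        last = self._last_request.get(user)
        if last is not None and now - last < self.min_delay:
            raise RuntimeError("searching too fast, please wait")
        self._last_request[user] = now
        return matches[: self.max_results]


quota = SearchQuota(max_results=3, min_delay=10.0)

# First search is accepted, but the result list is truncated.
first_page = quota.run("terkio", list(range(10)), now=0.0)

# A second search five seconds later is refused.
try:
    quota.run("terkio", list(range(10)), now=5.0)
    rejected = False
except RuntimeError:
    rejected = True
```

The time is passed in as `now` only to keep the sketch easy to test; a real system would read a clock instead.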
"You can be on the right track and still get hit by a train!" Alfred E. Neuman
Lord_BoNes
Jack of all trades
Posts: 1064
Joined: Mon Dec 01, 2008 12:36 pm
Location: Ararat, Australia.

Re: Advanced search

Post by Lord_BoNes »

@Gambit: If "dungeon" is considered a common word, then I'd reset it like you did for "story" above.
 

1 death is a tragedy,
10,000,000 deaths is a statistic.
- Joseph Stalin

Check out my Return to Chaos dungeon launcher
And my Dungeon Master Clone