<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
We have been using Elasticsearch (and Sphinx search before it) since
before 6.x. Back then search kind of worked, but it was really slow
at scale because it ran a SQL LIKE "%value%" query against every
field in every module. That is an unindexed brute-force search, and
with hundreds of thousands of records it was a serious load, causing
massive CPU spikes on the server and slowing the system down for
other users. Yes, searching was rather broken in 6.x (it is now
fast, but doesn't search much), so we put a lot of work into turning
our integration into a store-quality module that respects security
settings and so on. We think an external, specialised full-text
search tool would be of value at scale even if the MySQL-based
searching were improved: as far as I know, a MySQL LIKE pattern that
starts with % cannot use an ordinary index, so it is inherently a
brute-force search. It might well be possible to do something in
vtiger to nominate fields to be searched (so people could search for
phone numbers, for example) and to search more than it does at the
moment, but you are always trading performance off against the
amount you search.<br>
<br>
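As a rough sketch of the leading-wildcard point above, the snippet
below asks a database how it would run a prefix search and a
"%value%" search. It uses SQLite rather than MySQL purely because
SQLite ships with Python, and the contacts table, its columns and
the index name are made up for the illustration, so treat it as an
analogy rather than a picture of vtiger's schema.<br>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table. COLLATE NOCASE lets SQLite's case-insensitive
# LIKE make use of the index on the name column.
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX idx_contacts_name ON contacts (name)")
conn.executemany(
    "INSERT INTO contacts (name) VALUES (?)",
    [("Smith",), ("Smithers",), ("Jones",), ("Blacksmith",)],
)


def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Pattern anchored at the start: the index can narrow the range.
prefix_plan = query_plan("SELECT id FROM contacts WHERE name LIKE 'smith%'")
# Pattern starting with %: every row has to be examined.
wildcard_plan = query_plan("SELECT id FROM contacts WHERE name LIKE '%smith%'")

print("prefix:  ", prefix_plan)
print("wildcard:", wildcard_plan)
```
</pre>
With SQLite's default LIKE optimisation the first plan should be an
index SEARCH and the second a SCAN; the exact wording of the plan
text varies between SQLite versions.<br>
<br>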
We should expand our test data; it looks more impressive at scale
because it basically doesn't slow down. The performance you see
with just a few thousand records stays the same when the data set
gets huge. The times we mentioned for indexing refer to the initial
index loading process. What really matters is the search response
time once the data is indexed, and that is *fast*. It is also not a
heavy load on the MySQL process (there is some load, as we check
permissions for each record using the Vtiger API, but that is
already heavily optimised).<br>
<br>
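Why the response time holds up as the data grows can be sketched
with a toy inverted index, the kind of structure used by Lucene
(which Elasticsearch is built on). This is a minimal illustration
with made-up records and hypothetical function names, not how our
module or Elasticsearch is actually implemented.<br>
<pre>
```python
import re
from collections import defaultdict

# Made-up sample records standing in for CRM rows.
records = {
    1: "Acme Ltd, London office",
    2: "Jane Smith, Acme account manager",
    3: "Invoice for Smith and Sons",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def brute_force_search(term):
    """Examine every record, as a LIKE '%term%' query has to."""
    needle = term.lower()
    return sorted(rid for rid, text in records.items() if needle in text.lower())


# Build the index once, up front: token to the set of record ids containing it.
index = defaultdict(set)
for rid, text in records.items():
    for token in tokenize(text):
        index[token].add(rid)


def indexed_search(term):
    """Look the term up directly; the work depends on the number of
    matches, not on the total number of records."""
    return sorted(index.get(term.lower(), set()))


print(brute_force_search("smith"))
print(indexed_search("smith"))
```
</pre>
Note that this toy index matches whole tokens only, so unlike the
substring scan it would not find "smith" inside "Blacksmith"; a
real engine needs extra analysis (n-grams, for instance) for
that.<br>
<br>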
There are some core improvements that absolutely belong in the main
code, like the to_html performance, encrypted passwords for the
portal etc. Integrating a dedicated external search server isn't
quite like those.<br>
<br>
Alan.<br>
<br>
<br>
<div class="moz-cite-prefix">On 26/04/16 00:14, Błażej Pabiszczak
wrote:<br>
</div>
<blockquote
cite="mid:aa54ec54321eb3b11096a670c6605c30@yetiforce.com"
type="cite">
<p><span>I always have something to say, which has its pros and
cons, and this time will be no different.</span><br>
<br>
<span>I spent a couple of minutes thinking about why you created
this functionality. The only reasons I can come up with
are:</span><br>
<span>1. The producer's reluctance to introduce changes, so that,
just like vtexperts, you are moving away from improving the
open source system and towards developing commercial
modules.</span><br>
<span>2. Strictly financial. Searching was destroyed in v6.x to
such an extent that it is easy to make money from it.</span><br>
<br>
<span>If basic features such as:</span><br>
<span>1. Searching</span><br>
<span>2. PDF printouts </span><br>
<span>3. Sending emails</span><br>
<span>4. etc.</span><br>
<span>are going to be available only as paid solutions, then when
you look back in some time you will notice that it did more
harm than good for the project, because a tiny company
implementing this system would have to purchase a few
modules at the very beginning just to start working with
it. If this is the direction Vtiger and the community are going
[that is, a poor free core with the missing functionality paid
for], then let's be honest and stop calling it Open Source. I
don't mean your company specifically, but all the developer
companies around vtiger that write modules. You are already
starting to move in different directions: several module stores
have been created instead of just one, offering the same
functionality in ways that are compatible neither business-wise
nor logic-wise. And publishing paid modules that should be part
of the system by default makes the system lose its value. </span><br>
<br>
<span>Going back to vlastic: in my opinion using ElasticSearch
is a bad choice, because it doesn't solve the basic problem,
which is poor searching in the engine. Instead of fixing the
problem by improving the existing system, you added a separate
tool that is not an integral part of the system, and that
creates problems at several layers. 'Search' is a FUNDAMENTAL
mechanism and cannot be changed without changing the main
engine and, most of all, without significant changes to the
query generator. </span><br>
<br>
<span>In other topics you wrote about tests on databases that
had 20k records. You launched a system with 120k test records.
Even though it's not the best testing environment [the data
was generated in just one module instead of in all of them],
it's important to know that MySQL itself can easily deal with
larger databases without any clustering. This means that a few
changes in Vtiger's core would allow for much more than this
module currently offers, especially as improving the core
brings incomparable benefits.</span><br>
<br>
<span>Simple Vtiger code optimization might give significantly
better results than integration with any external tool.
ElasticSearch is meant for a completely different purpose than
the one you used it for. I can give you tons of examples where
solutions built directly into the core bring more benefits than
your external tool and, most importantly, are way faster
when optimized. </span><br>
<span>Take a look at how we developed our default search engine,
and why that solution is much better for the system than
using ElasticSearch.</span></p>
<div>---<br>
<div>Regards</div>
<div> </div>
<div><strong>Błażej Pabiszczak</strong></div>
<div><em>Chief Executive Officer</em></div>
<div>M: +48.884999123<br>
E: <a moz-do-not-send="true" title="Mail do Błażej Pabiszczak"
href="mailto:b.pabiszczak@yetiforce.com">b.pabiszczak@yetiforce.com</a></div>
<hr>
<p><span>YetiForce 3.0 LTS has arrived! </span><a
moz-do-not-send="true"
href="https://gitdeveloper.yetiforce.com/" rel="noreferrer">Test</a><span> the
latest, most innovative open source system in the world, and
</span><a moz-do-not-send="true"
href="https://github.com/YetiForceCompany/YetiForceCRM"
rel="noreferrer">join</a><span> our community.</span></p>
</div>
<p> </p>
<p>On 2016-04-21 11:18, Alan Lord wrote:</p>
<blockquote type="cite" style="padding: 0 0.4em; border-left:
#1010ff 2px solid; margin: 0">
<div class="pre" style="margin: 0; padding: 0; font-family:
monospace"><span style="white-space: nowrap;">On 11/04/16 10:46, Alan Lord wrote:</span>
<blockquote type="cite" style="padding: 0 0.4em; border-left:
#1010ff 2px solid; margin: 0"><span style="white-space:
nowrap;">On 11/04/16 10:33, Sutharsan Jeganathan wrote:</span></blockquote>
<br>
<blockquote type="cite" style="padding: 0 0.4em; border-left:
#1010ff 2px solid; margin: 0">
<blockquote type="cite" style="padding: 0 0.4em;
border-left: #1010ff 2px solid; margin: 0"><span
style="white-space: nowrap;">I think the demo CRM should have a large volume of data to preview the</span><br>
<span style="white-space: nowrap;">effectiveness. I think this extension is for CRM instances with</span><br>
<span style="white-space: nowrap;">larger data volumes, correct?</span></blockquote>
<br>
<span style="white-space: nowrap;">That is a very good point. But we don't have any "safe" data. We have</span><br>
<span style="white-space: nowrap;">tested it on customers' systems with large volumes (> 1m records) but of</span><br>
<span style="white-space: nowrap;">course we cannot put that data on-line for all to see ;-)</span><br>
<br>
<span style="white-space: nowrap;">I will see if we can obtain some public data that we can use.</span></blockquote>
<br>
I have updated our demo server and added and indexed around
121,000 records. I have a lot more data I can add, but I don't
have time to prepare the data for import yet.<br>
<br>
<span style="white-space: nowrap;"><a moz-do-not-send="true"
href="http://geotools.libertus.co.uk">http://geotools.libertus.co.uk</a></span><br>
<br>
However, you should see that the search is still nice and fast
with 120k records :-) Frankly, it doesn't seem to make much
difference to Elasticsearch how many rows it has.<br>
<br>
<span style="white-space: nowrap;">Hope this is helpful.</span><br>
<br>
A manual is available here: <a moz-do-not-send="true"
href="http://www.libertus.co.uk/images/documents/vlastic-manual.pdf">http://www.libertus.co.uk/images/documents/vlastic-manual.pdf</a><br>
<br>
Alan<br>
<br>
</div>
<br>
<div class="pre" style="margin: 0; padding: 0; font-family:
monospace">_______________________________________________<br>
<a moz-do-not-send="true" href="http://www.vtiger.com/">http://www.vtiger.com/</a></div>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>