Update README.md

Added examples and how-to

86c4669d · Francesco Poldi · GitHub · d4672ee0 · 86c4669d
Commit 86c4669d authored Sep 27, 2018 by Francesco Poldi Committed by GitHub Sep 27, 2018

Showing with 25 additions and 13 deletions

elasticsearch/README.md elasticsearch/README.md +25 -13

--- a/elasticsearch/README.md
+++ b/elasticsearch/README.md
@@ -32,40 +32,52 @@ Now that everything is up and running:
 1. Index some data: `python3.6 Twint.py --elasticsearch localhost:9200 -u user` (in this case `--elasticsearch` is a mandatory argument and its value is the host:port combination that the Elasticsearch instance is bound to);
-2. Now we can create the index (that I already created): open your browser and go to `http://localhost:5601` (again, this is a default value), `Dev Tools` tab, copy&paste `index-tweets.json` and than click the green arrow. Expected output is 
+2. Now we can create the index (that I already built): open your browser and go to `http://localhost:5601` (again, this is a default value), open the `Dev Tools` tab, copy and paste the contents of `index-tweets.json` and then click the green arrow. The expected output is:
 ```json
 {
  "acknowledged": true,
  "shards_acknowledged": true,
-  "index": "twint"
+  "index": "twinttweets"
 }
 ```
 3. Go to `Management` tab, `Index Patterns`, `Create Index Pattern`, `Index Pattern: twint` and choose `datestamp` as time field;
-4. Go to the `Discover` tab, choose `twint` and you should see something like this:
+4. Go to the `Discover` tab, choose `twinttweets` and you should see something like this:
 ![1](https://i.imgur.com/Ut9173J.png)
-PS: "twint" is just a custom name, feel free to change it accordingly at your needs, now as now the index name for tweets is `twinttweets`
+PS: in this screenshot the index is named `tweep`; you will see `twinttweets` instead.
-### Useful Tricks 
+### Query How-to 
-1. Filter out "multiplied" data and analyze only original tweets.
+1. Filter out "multiplied" data and analyze only the user's own tweets.
-Useful when you want to study the activity of a user, in the `Search` bar type `NOT _exists_:likes NOT _exists_:retweets NOT _exists_:replies`
+If you specified the `--es-count` parameter during the indexing phase, you may need to filter out the like/retweet/reply counts. To do so, type `NOT _exists_:likes NOT _exists_:retweets NOT _exists_:replies` in the `Search` bar;
+2. Filter tweets by a specific username: `username: handle`, where `handle` stands for the account `@handle`;
+3. Filter tweets by a specific user_id: `user_id: 0123456`;
+4. Filter tweets by a specific word in the tweet: `tweet: osint`;
+5. Define specific timestamp intervals: click on the clock in the top right corner;
+6. Concatenate conditions: Lucene syntax has some built-in logic; operators like `AND` and `OR` are useful to restrict the data that you want to study;
+[Here](https://www.elastic.co/guide/en/kibana/current/lucene-query.html) is a short article about Lucene Query Syntax.
+### Examples
+Search for every tweet from "@John" or "@Janet" (each tweet has a single `username`, so the two conditions are joined with `OR`):
+`username: John OR username: Janet`
+Search for tweets from "myearthquakeapp" and restrict the results to earthquakes with magnitude between 5.0 and 5.9:
+`username: myearthquakeapp AND tweet: 5.?`
+Search for tweets with at least 5 likes:
+`nlikes: [5 TO *]` and, similarly, tweets with at least 1 like but fewer than 10: `nlikes: [1 TO 10}` (`[]` extremes included, `{}` extremes excluded)
 ### Ready-to-Use Visualizations
 With the newest versions of Kibana users can export objects, for example, but not limited to, visualizations and dashboards. 
 Making visualizations is a simple but not easy process, you have to combine how you want to index data and how you want to visualize it.
-To help you getting started with Twint and Elasticsearch, I made some basic visualization and a dashboard. To use them you have just to import them: go to `Management` tab, `Saved Objects`, `Import` and then select `dashboard_visualizations.json`. 
+To help you get started with Twint and Elasticsearch, I made some basic visualizations and a dashboard. To use them you just have to import them: go to the `Management` tab (the gear), `Saved Objects`, `Import` and then select `dashboard_visualizations.json`. 
 After this, just go to the `Dashboard` tab and click on `Twint Dashboard`.
 ![2](https://i.imgur.com/QhqaENq.png)
-### Notes
-Different indexes can have different visualizations so there is not a general rule, with the basics provided in the Wiki you should be able to create visualizations. In any case, for every question, don't hesitate to ask.
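
The Lucene queries in the new `Query How-to` and `Examples` sections can also be assembled programmatically before being pasted into the Kibana `Search` bar. The following is only a minimal sketch: the helper names (`term`, `any_of`, `all_of`, `number_range`) are hypothetical and not part of Twint or Kibana, and the field names (`username`, `tweet`, `nlikes`) are taken from the README text above. It only builds query strings and does not contact Elasticsearch.

```python
# Hypothetical helpers (not part of Twint or Kibana) that build
# Lucene query strings like the ones shown in the README.


def term(field, value):
    """Build a `field: value` clause, e.g. `username: John`."""
    return f"{field}: {value}"


def any_of(*clauses):
    """Join clauses with OR: a document matching any clause is returned."""
    return " OR ".join(clauses)


def all_of(*clauses):
    """Join clauses with AND: a document must match every clause."""
    return " AND ".join(clauses)


def number_range(field, low="*", high="*", include_low=True, include_high=True):
    """Build a range clause; `[`/`]` include the bound, `{`/`}` exclude it."""
    left = "[" if include_low else "{"
    right = "]" if include_high else "}"
    return f"{field}: {left}{low} TO {high}{right}"


if __name__ == "__main__":
    # Tweets from either of two users (a tweet has only one username).
    print(any_of(term("username", "John"), term("username", "Janet")))
    # Tweets from one user whose text matches the wildcard pattern 5.?
    print(all_of(term("username", "myearthquakeapp"), term("tweet", "5.?")))
    # At least 5 likes.
    print(number_range("nlikes", low=5))
    # At least 1 like but fewer than 10 (upper bound excluded).
    print(number_range("nlikes", low=1, high=10, include_high=False))
```

Keeping the bracket choice in one function makes the inclusive/exclusive distinction explicit, which is the easiest detail to get wrong when writing range queries by hand.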