Commit 86c4669d authored by Francesco Poldi, committed by GitHub

Update README.md

Added examples and how-to
parent d4672ee0
@@ -32,40 +32,52 @@ Now that everything is up and running:
1. Index some data: `python3.6 Twint.py --elasticsearch localhost:9200 -u user` (in this case `--elasticsearch` is a mandatory argument and its value is the `host:port` combination that the Elasticsearch instance is bound to);
2. Now we can create the index (which I have already built): open your browser and go to `http://localhost:5601` (again, this is a default value), open the `Dev Tools` tab, copy and paste the contents of `index-tweets.json`, and then click the green arrow. The expected output is
```json
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "twinttweets"
}
```
3. Go to the `Management` tab, `Index Patterns`, `Create Index Pattern`, `Index Pattern: twinttweets` and choose `datestamp` as the time field;
4. Go to the `Discover` tab, choose `twinttweets` and you should see something like this:
![1](https://i.imgur.com/Ut9173J.png)
PS: this screenshot shows the index named `tweep`; you will see `twinttweets` instead.
### Query How-to
1. Filter out "multiplied" data and analyze only a user's own tweets.
If you specified the `--es-count` param during the indexing phase, you may need to filter out the like/retweet/reply count records. To do so, type `NOT _exists_:likes NOT _exists_:retweets NOT _exists_:replies` in the `Search` bar;
2. Show only the tweets of a specific username: `username: handle`, where `handle` is the username without the leading `@`;
3. Show only the tweets of a specific user_id: `user_id: 0123456`;
4. Show only the tweets that contain a specific word: `tweet: osint`;
5. Define a specific time interval: click on the clock in the top right corner;
6. Combine conditions: Lucene syntax has some built-in logic, and operators like `AND` and `OR` are useful to restrict the data that you want to study.
[Here](https://www.elastic.co/guide/en/kibana/current/lucene-query.html) is a short article about the Lucene query syntax.
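
The query strings above can also be composed programmatically before being pasted into the `Search` bar. The following is only a minimal sketch: the helper names (`term`, `not_exists`, `all_of`, `any_of`) are hypothetical and not part of Twint or Kibana, and the handles `alice` and `bob` are made up for illustration.

```python
def term(field, value):
    """Return a single `field: value` clause."""
    return f"{field}: {value}"


def not_exists(field):
    """Return a clause matching documents that lack `field`."""
    return f"NOT _exists_:{field}"


def _wrap(clause):
    # Parenthesize compound clauses so that nesting stays unambiguous.
    if " AND " in clause or " OR " in clause:
        return f"({clause})"
    return clause


def all_of(*clauses):
    """Join clauses with AND: every condition must hold."""
    return " AND ".join(_wrap(c) for c in clauses)


def any_of(*clauses):
    """Join clauses with OR: at least one condition must hold."""
    return " OR ".join(_wrap(c) for c in clauses)


# Skip the count records produced by --es-count (same string as in point 1).
own_tweets = " ".join(not_exists(f) for f in ("likes", "retweets", "replies"))

# Tweets mentioning "osint" written by either of two hypothetical handles.
combined = all_of(
    term("tweet", "osint"),
    any_of(term("username", "alice"), term("username", "bob")),
)

print(own_tweets)
print(combined)
```

Wrapping nested clauses in parentheses is a design choice that makes the operator grouping explicit instead of relying on the parser's default precedence.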
### Examples
Search for every tweet from either "@John" or "@Janet" (a tweet has a single author, so the conditions are joined with `OR`):
`username: John OR username: Janet`
Search for tweets from "myearthquakeapp" and restrict the results to earthquakes with a magnitude between 5.0 and 5.9:
`username: myearthquakeapp AND tweet: 5.?`
Search for tweets with at least 5 likes:
`nlikes: [5 TO *]`. Similarly, for tweets with at least 1 like but fewer than 10: `nlikes: [1 TO 10}` (a square bracket includes the extreme, a curly bracket excludes it, and the two can be mixed)
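
The same range queries can be sent to Elasticsearch directly instead of being typed in Kibana. The sketch below only builds the request body for a `query_string` search. The helper names `range_clause` and `search_body` are hypothetical, and it assumes the body would be sent to the `_search` endpoint of the `twinttweets` index on `localhost:9200`, as configured in the setup steps.

```python
import json


def range_clause(field, low, high, inclusive=True):
    """Build a Lucene range: [] includes the extremes, {} excludes them."""
    open_b, close_b = ("[", "]") if inclusive else ("{", "}")
    return f"{field}: {open_b}{low} TO {high}{close_b}"


def search_body(lucene_query, size=10):
    """Wrap a Lucene query string in an Elasticsearch search request body."""
    return {"size": size, "query": {"query_string": {"query": lucene_query}}}


# At least 5 likes: the upper bound is open (*).
at_least_five = range_clause("nlikes", 5, "*")

# Strictly between 1 and 10 likes: both extremes excluded.
between = range_clause("nlikes", 1, 10, inclusive=False)

body = search_body(f"username: myearthquakeapp AND {at_least_five}")
print(json.dumps(body, indent=2))
```

The printed JSON is what would be posted to `http://localhost:9200/twinttweets/_search`; the request itself is left out so that the sketch runs without a live Elasticsearch instance.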
### Ready-to-Use Visualizations
With the newest versions of Kibana, users can export objects such as visualizations and dashboards.
Making visualizations is a simple but not easy process: you have to combine how you want to index data with how you want to visualize it.
To help you get started with Twint and Elasticsearch, I made some basic visualizations and a dashboard. To use them, you just have to import them: go to the `Management` tab (the gear), `Saved Objects`, `Import`, and then select `dashboard_visualizations.json`.
After this, just go to the `Dashboard` tab and click on `Twint Dashboard`.
![2](https://i.imgur.com/QhqaENq.png)
### Notes
Different indexes can have different visualizations, so there is no general rule. With the basics provided in the Wiki, you should be able to create your own visualizations. In any case, if you have any questions, don't hesitate to ask.