Cribl Edge & Search Homelab Part 2 - Configuring Edge
Now that we have Minio up and running, it's time to configure Cribl Edge to collect some data. First, sign up for a Cribl.Cloud account. It's free for up to 1TB/day and 10 search executors. Signing up should be pretty straightforward, so I won't document it here; I signed up with my Google ID, as that's the simplest way for me to authenticate. Note that you can also run the Cribl leader yourself if you like, also free up to 10TB a day, but you won't have access to Search, as it is Cloud-only. So, after logging in:
- Click `Manage Edge`
- Click `Add/Update Edge Node`
- Click `Bootstrap New`
This should work by default, but there are a few considerations. You may want to create a new fleet, or set up a series of fleets with inheritance, but that's outside the scope of this post; `default_fleet` is the default and will work fine. Additionally, you may want to set Cribl to run as `root` if you want to get all the logs. Of course, there are more secure ways of doing this, but that's also outside the scope of this post.
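If you're not sure which user your Edge node ended up running as, a quick check on the node itself will tell you. A minimal sketch, assuming the process name contains `cribl` as it does on my install:

```sh
# Show which user owns the running Cribl Edge process
# (the [c] trick keeps grep from matching itself)
ps aux | grep -i '[c]ribl'
```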
- In the `Script` textbox, click the Copy icon in the upper right.
- On the Linux node you want to monitor, paste in the script you copied (the sketch after this list shows its general shape).
  - Note: if you are running as root, make sure you `sudo -i` to root before running, or add `| sudo bash -` to your script.
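For reference, the copied script takes roughly the following shape. This is only an illustration; the leader hostname, fleet name, and token below are placeholders, so use exactly what the UI gives you:

```sh
# Hypothetical shape of the bootstrap one-liner from the Cribl.Cloud UI;
# your leader hostname, fleet, and token will differ.
curl 'https://yourinstance.cribl.cloud/init/install-edge.sh?group=default_fleet&token=YOURTOKEN' | bash -
```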
This only takes a few seconds, and data should start flowing. By default, Edge will automatically collect application log files through autodiscovery, and you should see data start flowing on the main page. Now, let's configure our S3 output.
- Click `More > Destinations`
- In the search box at the upper right, type `S3`
- At the right, click `New Destination`
- At the bottom, click `Manage as JSON`
- Customize the JSON below with your local hostname and the credentials you created in the last step:
```json
{
  "id": "minio",
  "systemFields": [
    "cribl_pipe"
  ],
  "streamtags": [],
  "awsAuthenticationMethod": "manual",
  "signatureVersion": "v4",
  "reuseConnections": true,
  "rejectUnauthorized": true,
  "enableAssumeRole": false,
  "stagePath": "$CRIBL_HOME/state/outputs/staging",
  "addIdToStagePath": true,
  "removeEmptyDirs": false,
  "objectACL": "private",
  "partitionExpr": "`${C.Time.strftime(_time ? _time : Date.now()/1000, '%Y/%m/%d/%H')}/${host}`",
  "format": "json",
  "baseFileName": "`CriblOut`",
  "fileNameSuffix": "`.${C.env[\"CRIBL_WORKER_ID\"]}.${__format}${__compression === \"gzip\" ? \".gz\" : \"\"}`",
  "maxFileSizeMB": 32,
  "maxFileOpenTimeSec": 120,
  "maxFileIdleTimeSec": 30,
  "maxOpenFiles": 100,
  "onBackpressure": "block",
  "compress": "gzip",
  "type": "s3",
  "bucket": "YOURBUCKET",
  "region": "us-east-1",
  "awsApiKey": "YOURACCESSKEY",
  "awsSecretKey": "YOURSECRETKEY",
  "endpoint": "http://YOURHOST:9000",
  "destPath": "''"
}
```
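Before saving, it can be worth sanity-checking the endpoint and credentials from the Edge node. A minimal sketch using the AWS CLI, assuming you have it installed; the placeholders match the JSON above:

```sh
# List the bucket through Minio's S3-compatible API using the same
# credentials and endpoint as the destination config above
export AWS_ACCESS_KEY_ID=YOURACCESSKEY
export AWS_SECRET_ACCESS_KEY=YOURSECRETKEY
aws s3 ls s3://YOURBUCKET --endpoint-url http://YOURHOST:9000 --region us-east-1
```

If that lists without an error, the destination should be able to authenticate the same way.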
- Click `Save`
- At the top, click `Collect`
- Expand `File Monitor`
- Drag `File Monitor > in_file_auto` to connect to `S3 > minio`
- Leave it set to `Passthrough`, then click `Save`
- Do the same for `File Monitor > in_file_varlog`
- Click `Commit and Deploy`
- Click `Overview`
You should start seeing data flow to S3 on the Edge main page. You can click `Live` above the chart to see live data flowing through. You should also start seeing files show up in your `data` bucket in Minio under a path that looks like `2022/12/11/14/host`. With the partitioning structure we've set up, we should get reasonable performance when narrowing by time and by host. If any of this fails for you, please check out the Cribl Community and come in and ask me (or anyone) a question.
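As a worked example of how the destination expressions compose into that path (the worker ID `0` and host `myhost` below are illustrative values, not from the original config):

```sh
# partitionExpr:  strftime(_time, '%Y/%m/%d/%H') + host  =>  2022/12/11/14/myhost
# baseFileName + fileNameSuffix:  CriblOut + .0 + .json + .gz  =>  CriblOut.0.json.gz
# resulting object key in the bucket:
#   2022/12/11/14/myhost/CriblOut.0.json.gz
```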
Next, let’s expose Minio to the Internet.