Cribl Edge & Search Homelab Part 2 - Configuring Edge

Now that we have Minio up and running, it's time to configure Cribl Edge to collect some data. First, sign up for a Cribl.Cloud account. It's free for up to 1TB/day and 10 search executors. Signing up should be pretty straightforward, so I won't document it here. I signed up with my Google ID, as that's the simplest way for me to authenticate. Note that you can also run the Cribl leader yourself if you like (also free, up to 10TB a day), but you won't have access to Search, as it is Cloud-only. So, after logging in:

  1. Click Manage Edge
  2. Click Add/Update Edge Node
  3. Click Bootstrap New

This should work by default, but there are a few considerations. You may want to create a new fleet, or set up a series of fleets with inheritance, but that's outside the scope of this post; default_fleet is the default and will work fine. Additionally, you may want to set Cribl to run as root if you want to collect all the logs. Of course, there are more secure ways of doing this, but that's also outside the scope of this post.
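
If you do go the root route, one hypothetical approach on a systemd host is a drop-in override. The cribl-edge unit name here is an assumption on my part; check what the bootstrap script actually installed before trying this:

sudo systemctl edit cribl-edge
# In the editor that opens, add:
#   [Service]
#   User=root
#   Group=root
sudo systemctl restart cribl-edge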

  1. In the Script textbox, click the Copy icon in the upper right.
  2. On the Linux node you want to monitor, paste in the script you copied.
    • Note: if you want to run as root, make sure you sudo -i to root before running, or pipe the script to sudo bash - instead.
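
For reference, the copied script is a one-liner that downloads an installer from your leader and pipes it to bash. It will look roughly like this (the hostname, fleet, and token below are placeholders, not values to copy):

curl 'https://YOURLEADER.cribl.cloud/init/install-edge.sh?group=default_fleet&token=YOURTOKEN' | bash -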

This only takes a few seconds, and data should start flowing. By default, Edge automatically collects application log files through autodiscovery, and you should see data start flowing on the main page. Now, let's configure our S3 output.

  1. Click More>Destinations
  2. In the search box at the upper right, type S3
  3. At the right, click New Destination
  4. At the bottom, click Manage as JSON
  5. Customize the JSON below with your local hostname and the credentials you created in the last step:
{
  "id": "minio",
  "systemFields": [
    "cribl_pipe"
  ],
  "streamtags": [],
  "awsAuthenticationMethod": "manual",
  "signatureVersion": "v4",
  "reuseConnections": true,
  "rejectUnauthorized": true,
  "enableAssumeRole": false,
  "stagePath": "$CRIBL_HOME/state/outputs/staging",
  "addIdToStagePath": true,
  "removeEmptyDirs": false,
  "objectACL": "private",
  "partitionExpr": "`${C.Time.strftime(_time ? _time : Date.now()/1000, '%Y/%m/%d/%H')}/${host}`",
  "format": "json",
  "baseFileName": "`CriblOut`",
  "fileNameSuffix": "`.${C.env[\"CRIBL_WORKER_ID\"]}.${__format}${__compression === \"gzip\" ? \".gz\" : \"\"}`",
  "maxFileSizeMB": 32,
  "maxFileOpenTimeSec": 120,
  "maxFileIdleTimeSec": 30,
  "maxOpenFiles": 100,
  "onBackpressure": "block",
  "compress": "gzip",
  "type": "s3",
  "bucket": "YOURBUCKET",
  "region": "us-east-1",
  "awsApiKey": "YOURACCESSKEY",
  "awsSecretKey": "YOURSECRETKEY",
  "endpoint": "http://YOURHOST:9000",
  "destPath": "''"
}
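
Before clicking Save, it can be worth sanity-checking the endpoint and credentials from any machine with the MinIO client (mc) installed. Something like the following should list the bucket; the myminio alias name is arbitrary:

mc alias set myminio http://YOURHOST:9000 YOURACCESSKEY YOURSECRETKEY
mc ls myminio/YOURBUCKET
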
  1. Click Save
  2. Click Collect at the top
  3. Expand File Monitor
  4. Drag File Monitor>in_file_auto to connect to S3>minio
  5. Leave set to Passthrough, Click Save
  6. Do the same for File Monitor>in_file_varlog
  7. Click Commit and Deploy
  8. Click Overview

You should start seeing data flow to S3 on the Edge main page. You can click Live above the chart to watch live data flowing through. You should also start seeing files show up in your data bucket in Minio under a path that looks like 2022/12/11/14/host. With the partitioning structure we've set up, we should get reasonable performance when narrowing by time and by host. If any of this fails for you, please check out the Cribl Community and come in and ask me (or anyone) a question.
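
To confirm from the command line, the same mc alias from earlier can recursively list what Edge has written (your date, hour, and hostname will differ):

mc ls -r myminio/YOURBUCKET/2022/12/11/

Given the baseFileName and fileNameSuffix expressions in the destination config, the objects should be named something like CriblOut.0.json.gz.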

Next, let’s expose Minio to the Internet.
