This document describes the steps to push Nginx logs to an S3 bucket via Fluentd. This can be done in 2 ways:
Install Fluentd by following these instructions.
Depending on your Linux distribution, ensure that td-agent is running:
# This command works on Amazon Linux and Red Hat-based systems.
sudo service td-agent status
● td-agent.service - td-agent: Fluentd based data collector for Treasure Data
Loaded: loaded (/usr/lib/systemd/system/td-agent.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2021-03-26 05:05:28 UTC; 2s ago
Docs: <https://docs.treasuredata.com/articles/td-agent>
Process: 5879 ExecStop=/bin/kill -TERM ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 5891 ExecStart=/opt/td-agent/bin/fluentd --log $TD_AGENT_LOG_FILE --daemon /var/run/td-agent/td-agent.pid $TD_AGENT_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 5996 (fluentd)
Tasks: 8
Memory: 126.9M
CGroup: /system.slice/td-agent.service
├─5996 /opt/td-agent/bin/ruby /opt/td-agent/bin/fluentd --log /var/log/td-agent/td-agent.log --daemon /var/run/td-agent/td-agent.pid
└─5999 /opt/td-agent/bin/ruby -Eascii-8bit:ascii-8bit /opt/td-agent/bin/fluentd --log /var/log/td-agent/td-agent.log --daemon /var/run/td-agent/td-agent.pid --under-supervisor
Mar 26 05:05:25 ip-172-31-7-162.ap-south-1.compute.internal systemd[1]: Starting td-agent: Fluentd based data collector for Treasure Data...
Mar 26 05:05:28 ip-172-31-7-162.ap-south-1.compute.internal systemd[1]: Started td-agent: Fluentd based data collector for Treasure Data.
Ensure that the process for which you want to send the logs is running. In this case we will use Nginx:
sudo service nginx status
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2021-03-23 09:02:54 UTC; 2 days ago
Main PID: 25379 (nginx)
Tasks: 2
Memory: 4.3M
CGroup: /system.slice/nginx.service
├─25379 nginx: master process /usr/sbin/nginx
└─25382 nginx: worker process
Mar 23 09:02:54 ip-172-31-7-162.ap-south-1.compute.internal systemd[1]: Starting The nginx HTTP and reverse proxy server...
Mar 23 09:02:54 ip-172-31-7-162.ap-south-1.compute.internal nginx[25363]: nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
Mar 23 09:02:54 ip-172-31-7-162.ap-south-1.compute.internal nginx[25363]: nginx: configuration file /etc/nginx/nginx.conf test is successful
Mar 23 09:02:54 ip-172-31-7-162.ap-south-1.compute.internal systemd[1]: Failed to read PID from file /run/nginx.pid: Invalid argument
Mar 23 09:02:54 ip-172-31-7-162.ap-south-1.compute.internal systemd[1]: Started The nginx HTTP and reverse proxy server.
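Both status checks can also be scripted. The following is a minimal sketch, assuming a systemd-based host where `systemctl` is available; the `check_service` helper is a hypothetical name introduced here for illustration.

```shell
# Report whether a systemd unit is active.
# `systemctl is-active --quiet` exits 0 only when the unit is active.
check_service() {
  if systemctl is-active --quiet "$1" 2>/dev/null; then
    echo "$1: running"
  else
    echo "$1: not running"
    return 1
  fi
}

# Check both services used in this guide without aborting on the first failure.
for svc in td-agent nginx; do
  check_service "$svc" || true
done
```

The non-zero return value makes the helper usable in a larger script, for example to stop early when td-agent is down.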
Add the following config to /etc/td-agent/td-agent.conf:
<source>
@type tail
path /var/log/nginx/access.log #...or where you placed your Nginx access log
pos_file /var/log/td-agent/nginx-access.log.pos # This is where you record file position
tag nginx.access
format nginx
</source>
<source>
@type tail
path /var/log/nginx/error.log
pos_file /var/log/td-agent/nginx-error.log.pos
tag nginx.error
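format nginx
# The built-in nginx parser targets the access log format. If error log lines fail to parse, a regexp such as this one may work better:
# format /^(?<time>[^ ]+ [^ ]+) \[(?<log_level>.*)\] (?<pid>\d*).(?<tid>[^:]*): (?<message>.*)$/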
</source>
<match nginx.*>
@type "s3"
s3_bucket "$S3_BUCKET"
s3_region "$AWS_REGION"
path "logs/$INSTANCE_ID/%Y/%m/%d"
s3_object_key_format "%{path}/%{time_slice}_%{index}.log"
time_slice_format %Y%m%d%H%M
<format>
localtime false
</format>
<buffer time>
@type "file"
path "/var/tmp/fluentd/buffer/s3"
timekey $FLUENTD_CONFIG_S3_TIMEKEY
timekey_wait $FLUENTD_CONFIG_S3_TIMEKEY_WAIT
timekey_use_utc true
</buffer>
</match>
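Before restarting, it may help to check that the edited file parses. The sketch below assumes the `td-agent` wrapper is on the PATH and passes its arguments through to `fluentd`, whose `--dry-run` flag checks the setup without starting log collection. Replace the `$...` placeholders in the config first, otherwise the check runs against the literal placeholder text.

```shell
# Path of the config file edited in the previous step.
CONF=/etc/td-agent/td-agent.conf

if command -v td-agent >/dev/null 2>&1; then
  # --dry-run checks the configuration and exits without collecting logs.
  sudo td-agent --dry-run -c "$CONF" && echo "config OK"
else
  echo "td-agent not found on PATH"
fi
```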
Update the following variables in the above config:
$S3_BUCKET - Destination S3 bucket for the logs. It must already exist.
$AWS_REGION - AWS region of the S3 bucket.
$INSTANCE_ID - The EC2 instance ID or another unique identifier, which you may choose to omit. It identifies the system from which logs are sent.
$FLUENTD_CONFIG_S3_TIMEKEY - Flush interval, which also determines how logs are grouped.
$FLUENTD_CONFIG_S3_TIMEKEY_WAIT - Flush delay.
Restart td-agent. Ensure that Nginx is writing logs so that there is data to send.
sudo service td-agent restart
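If the server is idle, a few test requests will produce access log entries to ship. This is a minimal sketch, assuming Nginx serves plain HTTP on localhost port 80 and that `curl` is installed; `generate_requests` is a hypothetical helper name.

```shell
# Send COUNT requests to URL and report how many succeeded.
generate_requests() {
  url="$1"; count="$2"; ok=0; i=0
  while [ "$i" -lt "$count" ]; do
    # --fail makes curl exit non-zero on HTTP errors; --max-time avoids hanging.
    if curl --silent --fail --max-time 2 --output /dev/null "$url"; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok of $count requests succeeded"
}

# Assumption: Nginx listens on http://localhost/.
generate_requests "http://localhost/" 5
```

Requests that reach Nginx are written to the access log whether or not they succeed, so new lines at the end of /var/log/nginx/access.log are the thing to look for.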
Wait for the $FLUENTD_CONFIG_S3_TIMEKEY duration, plus the $FLUENTD_CONFIG_S3_TIMEKEY_WAIT delay, and then check the S3 bucket for logs.
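The bucket can be checked from the command line as well. The sketch below assumes the AWS CLI is installed and has credentials that can list the bucket; the default values for `S3_BUCKET` and `INSTANCE_ID` are placeholders, not values from this guide's environment.

```shell
# Assumption: example values; use the ones set in td-agent.conf.
S3_BUCKET="${S3_BUCKET:-nginx-logs}"
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"

# Mirror the configured path "logs/$INSTANCE_ID/%Y/%m/%d" for today's date in UTC.
S3_PREFIX="logs/${INSTANCE_ID}/$(date -u +%Y/%m/%d)"
echo "Looking under s3://${S3_BUCKET}/${S3_PREFIX}/"

if command -v aws >/dev/null 2>&1; then
  # aws s3 ls exits non-zero when nothing matches or the call fails.
  aws s3 ls "s3://${S3_BUCKET}/${S3_PREFIX}/" \
    || echo "no objects listed (nothing uploaded yet, or the listing failed)"
else
  echo "aws CLI not found on PATH"
fi
```

The prefix is built in UTC because the buffer section sets `timekey_use_utc true`.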
You can use environment variables in the Fluentd config by following the format mentioned in the documentation.
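As an illustration of that format: Fluentd evaluates embedded Ruby inside double-quoted strings, so an environment variable can be read with `"#{ENV['NAME']}"`. The sketch below writes two such lines to a scratch file for review; the scratch file is purely illustrative, and the lines would need to be merged into the `<match nginx.*>` section by hand.

```shell
# Write the fragment to a scratch file; the quoted 'EOF' keeps the shell
# from touching the contents.
FRAGMENT="$(mktemp)"
cat > "$FRAGMENT" <<'EOF'
# Inside the <match nginx.*> section:
s3_bucket "#{ENV['S3_BUCKET']}"
s3_region "#{ENV['AWS_REGION']}"
EOF

cat "$FRAGMENT"
```

With this form, the `$S3_BUCKET` and `$AWS_REGION` placeholders no longer need to be edited in the file itself, provided the variables are set in the environment of the td-agent process.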
To ensure that td-agent picks up the environment variables, add them to /etc/sysconfig/td-agent, e.g.:
# Append the variable to the environment file; tee creates the file if it does not exist.
echo 'S3_BUCKET=nginx-logs' | sudo tee -a /etc/sysconfig/td-agent
sudo service td-agent restart