k8s CKS Exam Question: Troubleshooting with Sysdig & Falco


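Question

Task:
Use a runtime detection tool to detect anomalous processes that are frequently spawned and executed in the single container of Pod tomcat123.
Two tools are available:
    sysdig
    falco

Note: these tools are pre-installed only on the cluster's worker node node02, not on the master node.
Analyze with the tool for at least 30 seconds, using filters that check for newly spawned and executed processes, and write the events to the file /opt/KSR00101/incidents/summary,
which should contain the detected incidents in the following format:
timestamp,uid/username,processName
Keep the tool's original timestamp format unchanged.

Note: make sure the incident file is stored on the cluster's worker node.

Note that in the exam the question states that sysdig is on the worker node, so you need to ssh to the worker node named at the start of the question.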

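Environment setup

Installing sysdig

Download it from the Releases page of the GitHub link listed under References below.
I downloaded sysdig-0.31.5-x86_64.tar.gz; after extracting it, just copy the executable into place.
Commands

tar -zxf sysdig-0.31.5-x86_64.tar.gz
cp sysdig-0.31.5-x86_64/usr/bin/sysdig /usr/bin/sysdig

Installing sysdig on Ubuntu
Command

apt-get install -y sysdig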

sysdig --help
sysdig version 0.31.5
Usage: sysdig [options] [-p <output_format>] [filter]

Options:
 -A, --print-ascii  Only print the text portion of data buffers, and echo
                    end-of-lines. This is useful to only display human-readable
                    data.
 -b, --print-base64 Print data buffers in base64. This is useful for encoding
                    binary data that needs to be used over media designed to
                    handle textual data (i.e., terminal or json).
 -B<bpf_probe>, --bpf=<bpf_probe>
                    Enable live capture using the specified BPF probe instead of the kernel module.
                    The BPF probe can also be specified via the environment variable
                    SYSDIG_BPF_PROBE. If <bpf_probe> is left empty, sysdig will
                    try to load one from the scap-driver-loader script.
 -c <chiselname> <chiselargs>, --chisel <chiselname> <chiselargs>
                    run the specified chisel. If the chisel require arguments,
                    they must be specified in the command line after the name.
 -cl, --list-chisels
                    lists the available chisels. Sysdig looks for chisels in the
                    following directories: ./chisels, ~/.chisels, /usr/share/sysdig/chisels.
  --cpus-for-each-buffer <cpus_num>
                    [EXPERIMENTAL] Please note this config regards only the modern BPF probe.
                    They are experimental so they could change over releases.
                    How many CPUs you want to assign to a single syscall buffer (ring buffer).
                    By default, every syscall buffer is associated to 2 CPUs, so the mapping is
                    1:2. The modern BPF probe allows you to choose different mappings, for
                    example, 1:1 would mean a syscall buffer for each CPU.
 -C <file_size>, --file-size=<file_size>
                    Before writing an event, check whether the file is
                    currently larger than file_size and, if so, close the
                    current file and open a new one. Saved files will have the
                    name specified with the -w flag, with a number after it,
                    starting at 0 and continuing upward. The units of file_size
                    are millions of bytes (10^6, not 2^20). Use the -W flag to
                    determine how many files will be saved to disk.
 --cri <path>       Path to CRI socket for container metadata
                    Use the specified socket to fetch data from a CRI-compatible runtime

 --cri-timeout <timeout_ms>
                    Wait at most <timeout_ms> milliseconds for response from CRI
 -d, --displayflt   Make the given filter a display one
                    Setting this option causes the events to be filtered
                    after being parsed by the state system. Events are
                    normally filtered before being analyzed, which is more
                    efficient, but can cause state (e.g. FD names) to be lost.
 -D, --debug        Capture events about sysdig itself, display internal events
                    in addition to system events, and print additional
                    logging on standard error.
 -E, --exclude-users
                    Don't create the user/group tables by querying the OS when
                    sysdig starts. This also means that no user or group info
                    will be written to the trace file by the -w flag.
                    The user/group tables are necessary to use filter fields
                    like user.name or group.name. However, creating them can
                    increase sysdig's startup time. Moreover, they contain
                    information that could be privacy sensitive.
 -e <num_events>    If used together with -w option, creates a series of dump files
                    containing only a specified number of events given in num_events
                    parameter each.
                    Used alongside -W flags creates a ring buffer of file containing
                    num_events each.
 -F, --fatfile      Enable fatfile mode
                    when writing in fatfile mode, the output file will contain
                    events that will be invisible when reading the file, but
                    that are necessary to fully reconstruct the state.
                    Fatfile mode is useful when saving events to disk with an
                    aggressive filter. The filter could drop events that would
                    the state to be updated (e.g. clone() or open()). With
                    fatfile mode, those events are still saved to file, but
                    'hidden' so that they won't appear when reading the file.
                    Be aware that using this flag might generate substantially
                    bigger traces files.
 --filter-proclist  apply the filter to the process table
                    a full dump of /proc is typically included in any trace file
                    to make sure all the state required to decode events is in the
                    file. This could cause the file to contain unwanted or sensitive
                    information. Using this flag causes the command line filter to
                    be applied to the /proc dump as well.
 -g, --gvisor-config
                    Parse events from gVisor using the specified configuration file.
                    A sysdig-compatible configuration file can be generated with --gvisor-generate-config
                    and can be used for both runsc and sysdig.
 --gvisor-generate-config [=<socket_path>(=/tmp/gvisor.sock)]
                    Generate a configuration file that can be used for gVisor.
 --gvisor-root <gvisor_root>
                    gVisor root directory for storage of container state. Equivalent to runsc --root flag.
 -G <num_seconds>, --seconds=<num_seconds>
                    Rotates the dump file specified with the -w option every
                    num_seconds seconds. Saved files will have the name specified
                    by -w which should include a time format as defined by strftime(3).
                    If no time format is specified, a counter will be used.
                    If no data format is specified, this can be used with -W flag to
                    create a ring buffer of events.
 -h, --help         Print this page
 -H <pluginname>[:<initconfig>], --plugin <pluginname>[:<initconfig>]
                    Registers a plugin, using the passed init config if present.
                    The format of initconf is controlled by the plugin, refer to each
                    plugin's documentation to learn about it.
                    A path can also be used as pluginname.
 -I <pluginname>[:<openparams>], --input <pluginname>[:<openparams>]
                    Set a previously registered plugin as input,
                    capturing events using it and passing the 
                    openparams string as open parameters.
                    Only a single source plugin can be registered.
                    If no plugins were registered, any found plugin in the directories
                    specified by ;-separated environment variable SYSDIG_PLUGIN_DIR and
                    in /usr/share/sysdig/plugins is registered; then use the provided one as input source.
                    The format of openparams is controlled by the plugin, refer to each
                    plugin's documentation to learn about it.
                    See https://falco.org/docs/plugins/plugin-api-reference/#ss-plugin-t-plugin-init-const-char-config-int32-t-rc-required-yes
                    and https://falco.org/docs/plugins/plugin-api-reference/#ss-instance-t-plugin-open-ss-plugin-t-s-const-char-params-int32-t-rc-required-yes for more infos.
                    The event sources available for capture vary depending on which 
                    plugins have been installed.
 -Il                Lists the loaded plugins. If no plugin has been registered through '-H',
                    Sysdig looks for plugins in the directories 
                    specified by ;-separated environment variable SYSDIG_PLUGIN_DIR and
                    in /usr/share/sysdig/plugins.
 -i <chiselname>, --chisel-info <chiselname>
                    Get a longer description and the arguments associated with
                    a chisel found in the -cl option list.
 -j, --json         Emit output as json, data buffer encoding will depend from the
                    print format selected.
 -k <url>, --k8s-api=<url>
                    Enable Kubernetes support by connecting to the API server
                    specified as argument. E.g. "http://admin:password@127.0.0.1:8080".
                    The API server can also be specified via the environment variable
                    SYSDIG_K8S_API.
 --node-name=<url>
                    The node name is used as a filter when requesting metadata of pods
                    to the API server; if empty, no filter is set
 -K <bt_file> | <cert_file>:<key_file[#password]>[:<ca_cert_file>], --k8s-api-cert=<bt_file> | <cert_file>:<key_file[#password]>[:<ca_cert_file>]
                    Use the provided files names to authenticate user and (optionally) verify the K8S API
                    server identity.
                    Each entry must specify full (absolute, or relative to the current directory) path
                    to the respective file.
                    Private key password is optional (needed only if key is password protected).
                    CA certificate is optional. For all files, only PEM file format is supported. 
                    Specifying CA certificate only is obsoleted - when single entry is provided 
                    for this option, it will be interpreted as the name of a file containing bearer token.
                    Note that the format of this command-line option prohibits use of files whose names contain
                    ':' or '#' characters in the file name.
                    Option can also be provided via the environment variable SYSDIG_K8S_API_CERT.
 -L, --list-events  List the events that the engine supports
 -l, --list         List the fields that can be used for filtering and output
                    formatting. Use -lv to get additional information for each
                    field.
 --libs-version     Print the falcosecurity/libs version
 --large-environment
                    Support environments larger than 4KiB
                    When the environment is larger than 4KiB, load the whole
                    environment from /proc instead of truncating to the first 4KiB
                    This may fail for short-lived processes and in that case
                    the truncated environment is used instead.
 --log-level=<trace|debug|info|notice|warning|error|critical|fatal>
                    Select log level. Useful together with --debug.
 --list-markdown    like -l, but produces markdown output
 -m <url[,marathon_url]>, --mesos-api=<url[,marathon_url]>
                    Enable Mesos support by connecting to the API server
                    specified as argument. E.g. "http://admin:password@127.0.0.1:5050".
                    Marathon url is optional and defaults to Mesos address, port 8080.
                    The API servers can also be specified via the environment variable
                    SYSDIG_MESOS_API.
--modern-bpf
                    [EXPERIMENTAL] Enable live capture using the modern BPF probe instead of
                    of the kernel module.
 -M <num_seconds>   Stop collecting after <num_seconds> reached.
 -n <num>, --numevents=<num>
                    Stop capturing after <num> events
 --page-faults      Capture user/kernel major/minor page faults
 --plugin-config-file
                    Load the plugin configuration from a Falco-compatible yaml file.
                    Do not mix this option with the '-H' or '-I' options: it is unsupported.
                    See the plugin section in https://falco.org/docs/configuration/ for
                    additional information
 -P, --progress     Print progress on stderr while processing trace files
 -p <output_format>, --print=<output_format>
                    Specify the format to be used when printing the events.
                    With -pc or -pcontainer will use a container-friendly format.
                    With -pk or -pkubernetes will use a kubernetes-friendly format.
                    With -pm or -pmesos will use a mesos-friendly format.
                    See the examples section below for more info.
 --plugin-info <pluginname>
                    Print info for a single plugin. This includes name, author,
                    and all the descriptive info of the plugin. If present,
                    this also prints the schema format for the init configuration
                    and a list of suggested open parameters.
                    All this info is controlled by the plugin, refer to each
                    plugin's documentation to learn more about it.
                    This can be combined with the -H option to load the plugin
                    with a given configuration.
                    A path can also be used as pluginname.
 -q, --quiet        Don't print events on the screen
                    Useful when dumping to disk.
 -R                 Resolve port numbers to names.
 -r <readfile>, --read=<readfile>
                    Read the events from <readfile>.
 -S, --summary      print the event summary (i.e. the list of the top events)
                    when the capture ends.
 -s <len>, --snaplen=<len>
                    Capture the first <len> bytes of each I/O buffer.
                    By default, the first 80 bytes are captured. Use this
                    option with caution, it can generate huge trace files.
 -t <timetype>, --timetype=<timetype>
                    Change the way event time is displayed. Accepted values are
                    h for human-readable string, a for absolute timestamp from
                    epoch, r for relative time from the beginning of the
                    capture, d for delta between event enter and exit, and
                    D for delta from the previous event.
 -T, --force-tracers-capture
                    Tell the driver to make sure full buffers are captured from
                    /dev/null, to make sure that tracers are completely
                    captured. Note that sysdig will enable extended /dev/null
                    capture by itself after detecting that tracers are written
                    there, but that could result in the truncation of some
                    tracers at the beginning of the capture. This option allows
                    preventing that.
 --unbuffered       Turn off output buffering. This causes every single line
                    emitted by sysdig to be flushed, which generates higher CPU
                    usage but is useful when piping sysdig's output into another
                    process or into a script.
 -U, --suppress-comm
                    Ignore all events from processes having the provided comm.
 -v, --verbose      Verbose output.
                    This flag will cause the full content of text and binary
                    buffers to be printed on screen, instead of being truncated
                    to 40 characters. Note that data buffers length is still
                    limited by the snaplen (refer to the -s flag documentation)
                    -v will also make sysdig print some summary information at
                    the end of the capture.
 --version          Print version number.
 -w <writefile>, --write=<writefile>
                    Write the captured events to <writefile>.
 -W <num>, --limit <num>
                    Used in conjunction with the -C option, this will limit the number
                    of files created to the specified number, and begin overwriting files
                    from the beginning, thus creating a 'rotating' buffer.

                    Used in conjunction with the -G option, this will limit the number
                    of rotated dump files that get created, exiting with status 0 when
                    reaching the limit. If used with -C as well, the behavior will result
                    in cyclical files per timeslice.
 -x, --print-hex    Print data buffers in hex.
 -X, --print-hex-ascii
                    Print data buffers in hex and ASCII.
 -z, --compress     Used with -w, enables compression for trace files.

Output format:

By default, sysdig prints the information for each captured event on a single
 line with the following format:

 %evt.num %evt.outputtime %evt.cpu %proc.name (%thread.tid) %evt.dir %evt.type %evt.info

where:
 evt.num is the incremental event number
 evt.time is the event timestamp
 evt.cpu is the CPU number where the event was captured
 proc.name is the name of the process that generated the event
 thread.tid id the TID that generated the event, which corresponds to the
   PID for single thread processes
 evt.dir is the event direction, > for enter events and < for exit events
 evt.type is the name of the event, e.g. 'open' or 'read'
 evt.info is the list of event arguments.

The output format can be customized with the -p switch, using any of the
fields listed by 'sysdig -l'.

Using -pc or -pcontainer, the default format will be changed to a container-friendly one:

%evt.num %evt.outputtime %evt.cpu %container.name (%container.id) %proc.name (%thread.tid:%thread.vtid) %evt.dir %evt.type %evt.info

Using -pk or -pkubernetes, the default format will be changed to a kubernetes-friendly one:

%evt.num %evt.outputtime %evt.cpu %k8s.pod.name (%container.id) %proc.name (%thread.tid:%thread.vtid) %evt.dir %evt.type %evt.info

Using -pm or -pmesos, the default format will be changed to a mesos-friendly one:

%evt.num %evt.outputtime %evt.cpu %mesos.task.name (%container.id) %proc.name (%thread.tid:%thread.vtid) %evt.dir %evt.type %evt.info

Examples:

 Capture all the events from the live system and print them to screen
   $ sysdig

 Capture all the events from the live system and save them to disk
   $ sysdig -w dumpfile.scap

 Read events from a file and print them to screen
   $ sysdig -r dumpfile.scap

 Print all the open system calls invoked by cat
   $ sysdig proc.name=cat and evt.type=open

 Print the name of the files opened by cat
   $ sysdig -p"%evt.arg.name" proc.name=cat and evt.type=open

 Register any found plugin and use dummy as input source passing to it open params
   $ sysdig -I dummy:10'

 Load and register dummy source plugin passing to it init config and open params
   $ sysdig -H dummy:'{"jitter":50}' -I dummy:10

Create the container

kubectl run tomcat123 --image=nginx

Create the directory and file

mkdir -p /opt/KSR00101/incidents/ && touch /opt/KSR00101/incidents/summary

Solution - sysdig

Find the container name or ID

docker ps | grep tomcat123

If docker is not available, use

crictl ps | grep tomcat123

If neither is available, you can use

kubectl get po tomcat123 -oyaml | grep containerID
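
The containerID printed by kubectl carries a runtime prefix (for example containerd://) followed by the full ID, while sysdig's container.id field holds the short, 12-character form. A minimal sketch of the conversion follows; the helper name short_container_id and the sample ID are hypothetical and only serve as illustration.

```shell
# Hypothetical helper: strip the runtime prefix ("containerd://", "docker://", ...)
# and keep the first 12 characters of the container ID.
short_container_id() {
  printf '%s\n' "${1#*://}" | cut -c1-12
}

# Made-up sample value. On a real cluster the value could come from, e.g.:
#   kubectl get po tomcat123 -o jsonpath='{.status.containerStatuses[0].containerID}'
sample="containerd://0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
short_id="$(short_container_id "$sample")"
echo "$short_id"   # prints 0123456789ab
```

The resulting value is what would go after container.id= in a sysdig filter.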

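Run the detection with sysdig, filtering either by container name or by container ID. In the second command, replace CONTAINER_ID with the ID found above; sysdig matches the short form of the ID (its first 12 characters).
Commands

sysdig -M 30 -p "%evt.time,%user.name,%proc.name" --cri /run/containerd/containerd.sock container.name=tomcat123 >> /opt/KSR00101/incidents/summary
sysdig -M 30 -p "%evt.time,%user.name,%proc.name" --cri /run/containerd/containerd.sock container.id=CONTAINER_ID >> /opt/KSR00101/incidents/summary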
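
The commands above record every event from the container. Since the task asks for a filter on spawned and executed processes, one possible narrowing is to match only completed execve calls. The sketch below is an assumption on my part rather than the exam's reference answer: the wrapper name capture_spawned and the overridable SYSDIG variable are hypothetical, while evt.type and evt.dir are standard sysdig filter fields.

```shell
# Hypothetical wrapper around sysdig.
#   $1 = container name, $2 = output file
# SYSDIG can be overridden (an assumption made here to allow dry runs).
capture_spawned() {
  "${SYSDIG:-sysdig}" -M 30 \
    -p "%evt.time,%user.name,%proc.name" \
    --cri /run/containerd/containerd.sock \
    "container.name=$1 and evt.type=execve and evt.dir=<" >> "$2"
}

# Example call on the worker node:
#   capture_spawned tomcat123 /opt/KSR00101/incidents/summary
```

Matching only the exit direction (evt.dir=<) means proc.name is the newly executed program rather than the process that called execve.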


Solution - falco

Check whether falco is installed
Command

falco --help

Check that the tomcat123 pod exists
Command

kubectl get po


Write a falco rule
rule1.yaml

- rule: rule1
  desc: rule1
  condition: container.name = "tomcat123"
  output: "%evt.time,%user.name,%proc.name"
  priority: WARNING
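
The rule above fires on every event from the container. A variant that fires only on process executions, which is closer to the wording of the task, could look like the sketch below. The file name rule-exec.yaml and the rule name are hypothetical, and the condition spells out the execve check instead of relying on macros from the default rules file, since only this one file is loaded with -r. Whether evt.dir is still needed may depend on the falco version.

```shell
# Write the rule to a file; RULE_FILE and its default name are hypothetical.
rule_file="${RULE_FILE:-rule-exec.yaml}"
cat > "$rule_file" <<'EOF'
- rule: tomcat123_spawned_process
  desc: Process executed inside the tomcat123 container
  condition: container.name = "tomcat123" and evt.type in (execve, execveat) and evt.dir = <
  output: "%evt.time,%user.name,%proc.name"
  priority: WARNING
EOF

# Then, on the worker node:
#   sudo falco -M 30 -r "$rule_file" >> /opt/KSR00101/incidents/summary
```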

Command

sudo falco -M 30 -r rule1.yaml >> /opt/KSR00101/incidents/summary

Wait 30 seconds. While waiting, open another tab, enter the container, and run commands such as ls.
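
Instead of typing commands by hand, the activity can be generated from the second tab with a small loop. This is only a sketch: the function name generate_activity and the overridable KUBECTL variable are hypothetical, and it assumes kubectl is configured in the shell where it runs.

```shell
# Hypothetical helper: run a harmless command inside the pod several times so
# that the capture has process executions to record.
#   $1 = pod name, $2 = number of rounds (one per second)
generate_activity() {
  i=0
  while [ "$i" -lt "$2" ]; do
    "${KUBECTL:-kubectl}" exec "$1" -- ls / >/dev/null
    i=$((i + 1))
    sleep 1
  done
}

# Example call while falco (or sysdig) is capturing:
#   generate_activity tomcat123 10
```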

Error (when installing on CentOS)

Unable to load the driver
error opening device /dev/scap0. Make sure you have root credentials and that the scap module is loaded: No such file or directory
modprobe scap
modprobe: ERROR: could not insert 'scap': Required key not available

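I looked into it and it is an OS-level problem: "Required key not available" typically means the kernel refused to load an unsigned module, for example because Secure Boot is enabled. I did the exercise in the simulated environment instead.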

References

github-sysdig
wiki-How-to-Install-Sysdig-for-Linux
sysdig.com
Youtube-Sysdig Open Source - Getting Started With Csysdig
github-falco
falco-rules
