Can A-Tune run on an ECS instance?

Status: Done
Type: Task
Created: 2020-07-29 21:01

I want to run A-Tune on an ECS instance (running openEuler on the Kunpeng architecture).

I installed A-Tune following the instructions at https://gitee.com/openeuler/A-Tune:

[root@openeuler A-Tune]# yum install -y atune 

Then I ran the following commands:

[root@openeuler A-Tune]# systemctl daemon-reload 

[root@openeuler A-Tune]# systemctl start atuned 

[root@openeuler A-Tune]# systemctl status atuned  
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-07-29 20:07:04 CST; 7s ago
 Main PID: 7953 (atuned)
    Tasks: 14
   Memory: 124.8M
   CGroup: /system.slice/atuned.service
           ├─7953 /usr/bin/atuned
           ├─7960 python3 /usr/libexec/atuned/analysis/app.py /etc/atuned/atuned.cnf
           └─7964 /usr/bin/python3 -c from multiprocessing.semaphore_tracker import main;main(3)

Jul 29 20:07:05 openeuler atuned[7960]: 2020-07-29 20:07:05,618 [INFO] flask.app[line:38] : {'module': 'CPU', 'purpose': 'INFO', 'fmt': 'xml', 'path': '/usr/share/atuned/checker/cpu_in>
Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: cpu_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: mem_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: cpu_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: mem_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: net_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: net_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7960]: 2020-07-29 20:07:06,991 [INFO] flask.app[line:38] : {'module': 'NET', 'purpose': 'TOPO', 'fmt': 'xml', 'path': '/usr/share/atuned/checker/net_to>
Jul 29 20:07:08 openeuler atuned[7953]: time="2020-07-29T20:07:08+08:00" level=info msg="memory total num is : 1" file="mem_topo.go:161"
Jul 29 20:07:08 openeuler atuned[7953]: time="2020-07-29T20:07:08+08:00" level=info msg="memory total num is : 1" file="mem_topo.go:161"


[root@openeuler A-Tune]# atune-adm analysis
 1. Analysis system runtime information: CPU Memory IO and Network...
 collect data faild
rpc error: code = Unknown desc = collect data faild

The output above says data collection failed. What is the problem? Is A-Tune unable to run on ECS?

Checking the status again shows many error messages:

[root@openeuler A-Tune]# systemctl status atuned
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-07-29 20:07:04 CST; 7min ago
 Main PID: 7953 (atuned)
    Tasks: 15
   Memory: 127.0M
   CGroup: /system.slice/atuned.service
           ├─7953 /usr/bin/atuned
           ├─7960 python3 /usr/libexec/atuned/analysis/app.py /etc/atuned/atuned.cnf
           └─7964 /usr/bin/python3 -c from multiprocessing.semaphore_tracker import main;main(3)

Jul 29 20:07:06 openeuler atuned[7953]: time="2020-07-29T20:07:06+08:00" level=info msg="initializing checker service: net_topo" file="schedule.go:183"
Jul 29 20:07:06 openeuler atuned[7960]: 2020-07-29 20:07:06,991 [INFO] flask.app[line:38] : {'module': 'NET', 'purpose': 'TOPO', 'fmt': 'xml', 'path': '/usr/share/atuned/checker/net_to>
Jul 29 20:07:08 openeuler atuned[7953]: time="2020-07-29T20:07:08+08:00" level=info msg="memory total num is : 1" file="mem_topo.go:161"
Jul 29 20:07:08 openeuler atuned[7953]: time="2020-07-29T20:07:08+08:00" level=info msg="memory total num is : 1" file="mem_topo.go:161"
Jul 29 20:09:44 openeuler atuned[7960]: 2020-07-29 20:09:44,703 [INFO] flask.app[line:39] : {'sample_num': 20, 'monitors': [{'module': 'CPU', 'purpose': 'STAT', 'field': '--interval=5;>
Jul 29 20:09:46 openeuler atuned[7960]: 2020-07-29 20:09:46,171 [ERROR] analysis.plugin.monitor.memory.topo[line:55] : MemTopo.table_get_locator: Fail to find data
Jul 29 20:09:46 openeuler atuned[7960]: 2020-07-29 20:09:46,171 [ERROR] flask.app[line:1786] : Exception on /v1/collector [POST]
                                        Traceback (most recent call last):
                                          File "/usr/lib/python3.7/site-packages/flask/app.py", line 1838, in full_dispatch_request
                                            rv = self.dispatch_request()
                                          File "/usr/lib/python3.7/site-packages/flask/app.py", line 1824, in dispatch_request
                                            return self.view_functions[rule.endpoint](**req.view_args)
                                          File "/usr/lib/python3.7/site-packages/flask_restful/__init__.py", line 480, in wrapper
                                            resp = resource(*args, **kwargs)
                                          File "/usr/lib/python3.7/site-packages/flask/views.py", line 88, in view
                                            return self.dispatch_request(*args, **kwargs)
                                          File "/usr/lib/python3.7/site-packages/flask_restful/__init__.py", line 595, in dispatch_request
                                            resp = meth(*args, **kwargs)
                                          File "/usr/lib/python3.7/site-packages/flask_restful/__init__.py", line 722, in wrapper
                                            resp = f(*args, **kwargs)
                                          File "/usr/libexec/atuned/analysis/../analysis/resources/collector.py", line 48, in post
                                            mpis.append(MPI.get_monitor(monitor["module"], monitor["purpose"]))
                                          File "/usr/libexec/atuned/analysis/../analysis/plugin/plugin.py", line 100, in get_monitor
                                            mpis = MPI.get_monitors(module, purpose)
                                          File "/usr/libexec/atuned/analysis/../analysis/plugin/plugin.py", line 87, in get_monitors
                                            m_class = sub_class()
                                          File "/usr/libexec/atuned/analysis/../analysis/plugin/monitor/memory/bandwidth.py", line 98, in __init__
                                            self.__cnt["CPU0_Max"] = self.__get_theory_bandwidth(0) / 1024 / 1024
                                          File "/usr/libexec/atuned/analysis/../analysis/plugin/monitor/memory/bandwidth.py", line 156, in __get_theory_bandwidth
                                            locator = memtopo.table_get_locator(dimm["slot"])
                                          File "/usr/libexec/atuned/analysis/../analysis/plugin/monitor/memory/topo.py", line 56, in table_get_locator
                                            raise err
                                        LookupError: Fail to find data
Jul 29 20:12:49 openeuler atuned[7960]: 2020-07-29 20:12:49,135 [INFO] flask.app[line:39] : {'sample_num': 20, 'monitors': [{'module': 'CPU', 'purpose': 'STAT', 'field': '--interval=5;>
Jul 29 20:12:50 openeuler atuned[7960]: 2020-07-29 20:12:50,515 [ERROR] analysis.plugin.monitor.memory.topo[line:55] : MemTopo.table_get_locator: Fail to find data
Jul 29 20:12:50 openeuler atuned[7960]: 2020-07-29 20:12:50,516 [ERROR] flask.app[line:1786] : Exception on /v1/collector [POST]
                                        Traceback (most recent call last):
                                          File "/usr/lib/python3.7/site-packages/flask/app.py", line 1838, in full_dispatch_request
                                            rv = self.dispatch_request()
                                          File "/usr/lib/python3.7/site-packages/flask/app.py", line 1824, in dispatch_request
                                            return self.view_functions[rule.endpoint](**req.view_args)
                                          File "/usr/lib/python3.7/site-packages/flask_restful/__init__.py", line 480, in wrapper
                                            resp = resource(*args, **kwargs)
                                          File "/usr/lib/python3.7/site-packages/flask/views.py", line 88, in view
                                            return self.dispatch_request(*args, **kwargs)
                                          File "/usr/lib/python3.7/site-packages/flask_restful/__init__.py", line 595, in dispatch_request

The atuned process is still running:

[root@openeuler A-Tune]# ps -ef | grep atuned
root        7953       1  0 20:07 ?        00:00:00 /usr/bin/atuned
root        7960    7953  0 20:07 ?        00:00:01 python3 /usr/libexec/atuned/analysis/app.py /etc/atuned/atuned.cnf
root        8309    7219  0 20:16 pts/0    00:00:00 grep --color=auto atuned

So I rebooted the ECS instance:

[root@openeuler A-Tune]# reboot

Then I ran the following commands:

[root@openeuler ~]# systemctl daemon-reload
[root@openeuler ~]# systemctl start atuned
Job for atuned.service failed because the control process exited with error code.
See "systemctl status atuned.service" and "journalctl -xe" for details.
[root@openeuler ~]# systemctl status atuned.service
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2020-07-29 20:19:12 CST; 37s ago
  Process: 2533 ExecStart=/usr/bin/atuned (code=exited, status=1/FAILURE)
 Main PID: 2533 (code=exited, status=1/FAILURE)

Jul 29 20:19:12 openeuler systemd[1]: Starting A-Tune Daemon...
Jul 29 20:19:12 openeuler systemd[1]: atuned.service: Main process exited, code=exited, status=1/FAILURE
Jul 29 20:19:12 openeuler systemd[1]: atuned.service: Failed with result 'exit-code'.
Jul 29 20:19:12 openeuler systemd[1]: Failed to start A-Tune Daemon.

And:

[root@openeuler ~]# journalctl -xe
Jul 29 20:19:12 openeuler systemd[1]: atuned.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- An ExecStart= process belonging to unit atuned.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 1.
Jul 29 20:19:12 openeuler systemd[1]: atuned.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The unit atuned.service has entered the 'failed' state with result 'exit-code'.
Jul 29 20:19:12 openeuler systemd[1]: Failed to start A-Tune Daemon.
-- Subject: A start job for unit atuned.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit atuned.service has finished with a failure.
--
-- The job identifier is 598 and the job result is failed.
Jul 29 20:19:12 openeuler [2551]: [systemctl start atuned] return code=[1], execute failed by [root(uid=0)] from [pts/0 (119.3.119.18)]
Jul 29 20:19:14 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:19:24 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:19:34 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:19:44 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:19:49 openeuler systemd[1]: Configuration file /usr/lib/systemd/system/atuned.service is marked world-inaccessible. This has no effect as configuration data is accessible via>
Jul 29 20:19:49 openeuler [2589]: [systemctl status atuned.service] return code=[3], execute failed by [root(uid=0)] from [pts/0 (119.3.119.18)]
Jul 29 20:19:54 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:20:04 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:20:14 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:20:24 openeuler /usr/sbin/irqbalance[822]: IRQ virtio0 (40) guessed as class 0
Jul 29 20:20:24 openeuler systemd[1]: Starting system activity accounting tool...
-- Subject: A start job for unit sysstat-collect.service has begun execution
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit sysstat-collect.service has begun execution.
--
-- The job identifier is 666.
Jul 29 20:20:24 openeuler audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? t>
Jul 29 20:20:24 openeuler audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? te>
Jul 29 20:20:24 openeuler systemd[1]: sysstat-collect.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The unit sysstat-collect.service has successfully entered the 'dead' state.
Jul 29 20:20:24 openeuler systemd[1]: Started system activity accounting tool.
-- Subject: A start job for unit sysstat-collect.service has finished successfully
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit sysstat-collect.service has finished successfully.
--
-- The job identifier is 666.

The output above shows that the A-Tune service failed to start. What is the cause?

ECS system information:

[root@openeuler ~]# uname -a
Linux openeuler 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

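I followed this guide:

https://openeuler.org/zh/docs/20.03_LTS/docs/A-Tune/%E5%AE%89%E8%A3%85%E4%B8%8E%E9%83%A8%E7%BD%B2.html

Do I need to import the RPM signing key? Something like this:

# rpm --import /mnt/RPM-GPG-KEY-openEuler

Or does ECS impose some restriction that prevents A-Tune from running?

Thanks!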

Comments (18)

woodrabbit created the task
woodrabbit set the linked repository to openEuler/A-Tune

Hey @woodrabbit, Welcome to openEuler Community.
All of the projects in openEuler Community are maintained by @openeuler-ci-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at https://gitee.com/openeuler/community/blob/master/en/sig-infrastructure/command.md to find the details.

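Hello, @woodrabbit
The A-Tune v0.2 shipped by default with openEuler 20.03 LTS only supports running on Kunpeng physical machines. We recently added support on the master branch for running atuned in virtual machines; you could try it and see whether it solves your problem :)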
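
The `LookupError: Fail to find data` in the traceback posted earlier is consistent with this explanation: the memory bandwidth monitor looks up a DIMM slot in a topology table, and a virtual machine may not expose that per-slot data. The following is only a minimal Python sketch of that failure mode. The table contents and the `safe_locator` helper are hypothetical and are not taken from the A-Tune source; only the function name `table_get_locator` and the error message come from the log.

```python
from typing import Optional

# Hypothetical stand-ins for the topology table the monitor queries.
# Keys are DIMM slot names, values are locator strings.
PHYSICAL_TABLE = {"DIMM000": "DIMM000 J11", "DIMM010": "DIMM010 J12"}
VIRTUAL_TABLE: dict = {}  # assumption: a VM reports no per-slot DIMM data


def table_get_locator(table: dict, slot: str) -> str:
    """Return the locator for a slot; raise LookupError when it is absent."""
    try:
        return table[slot]
    except KeyError:
        # Same message as in the log; an uncaught error here aborts the request.
        raise LookupError("Fail to find data") from None


def safe_locator(table: dict, slot: str) -> Optional[str]:
    """Hypothetical caller that degrades to None instead of failing."""
    try:
        return table_get_locator(table, slot)
    except LookupError:
        return None
```

With the empty table, `table_get_locator` raises and the whole `/v1/collector` request fails, which matches the `collect data faild` message from `atune-adm analysis`. A caller written like `safe_locator` would skip the unavailable metric instead.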

@谢志鹏 OK, I will take a look first. Thanks!

@谢志鹏 Is that branch "dev"? Do I have to download the source and build and install that branch myself?

I reinstalled from the dev branch and tried again:

  1. My system
[root@openeuler ~]# uname -a
Linux openeuler 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
  2. Install A-Tune
yum install -y golang-bin python3 perf sysstat hwloc-gui
yum install -y python3-dict2xml python3-flask-restful python3-pandas python3-scikit-optimize python3-xgboost
mkdir -p /home/gopath/src
cd /home/gopath/src

git clone -b dev https://gitee.com/openeuler/A-Tune.git atune

cd atune/
export GO111MODULE=off
make
make install
  3. Start the atuned service and check its status
systemctl daemon-reload
systemctl start atuned

[root@openeuler atune]# systemctl status atuned
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-07-29 22:24:25 CST; 17s ago
 Main PID: 9677 (atuned)
    Tasks: 14
   Memory: 121.3M
   CGroup: /system.slice/atuned.service
           ├─9677 /usr/bin/atuned
           ├─9684 python3 /usr/libexec/atuned/analysis/app.py /etc/atuned/atuned.cnf
           └─9687 /usr/bin/python3 -c from multiprocessing.semaphore_tracker import main;main(3)

Jul 29 22:24:30 openeuler atuned[9684]: 2020-07-29 22:24:30,271 [ERROR] analysis.plugin.configurator.kernel_config.kconfig[line:66] : KernelConfig._get: not find one CONFIG_NUMA_AWARE_>
Jul 29 22:24:30 openeuler atuned[9684]: 2020-07-29 22:24:30,273 [ERROR] analysis.plugin.configurator.common[line:195] : KernelConfig.get: not find one CONFIG_NUMA_AWARE_SPINLOCKS
Jul 29 22:24:30 openeuler atuned[9684]: 2020-07-29 22:24:30,274 [ERROR] analysis.plugin.configurator.common[line:97] : KernelConfig.set: Please change the kernel configuration CONFIG_N>
Jul 29
  4. Edit the configuration file
vi /etc/atuned/atuned.cnf

cat /etc/atuned/atuned.cnf
# Copyright (c) 2019 Huawei Technologies Co., Ltd.
# A-Tune is licensed under the Mulan PSL v1.
# You can use this software according to the terms and conditions of the Mulan PSL v1.
# You may obtain a copy of Mulan PSL v1 at:
#     http://license.coscl.org.cn/MulanPSL
# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR
# PURPOSE.
# See the Mulan PSL v1 for more details.
# Create: 2019-10-29

#################################### server ###############################
# atuned config
[server]
# the protocol grpc server running on
# ranges: unix or tcp
protocol = unix

# the address that the grpc server to bind to
# default is unix socket /var/run/atuned/atuned.sock
# ranges: /var/run/atuned/atuned.sock or ip address
address = /var/run/atuned/atuned.sock

# the atune nodes in cluster mode, separated by commas
# it is valid when protocol is tcp
# connect = ip01,ip02,ip03

# the atuned grpc listening port
# the port can be set between 0 to 65535 which not be used
# port = 60001

# the rest service listening port, default is 8383
# the port can be set between 0 to 65535 which not be used
rest_port = 8383

# when run analysis command, the numbers of collected data.
# default is 20
sample_num = 20

# enable gRPC and http server authentication SSL/TLS
# default is false
# tls = true
# tlsservercertfile = /etc/atuned/server.pem
# tlsserverkeyfile = /etc/atuned/server.key
# tlshttpcertfile = /etc/atuned/http/server.pem
# tlshttpkeyfile = /etc/atuned/http/server.key
# tlshttpcacertfile = /etc/atuned/http/cacert.pem

#################################### log ###############################
[log]
# either "debug", "info", "warn", "error", "critical", default is "info"
level = info

#################################### monitor ###############################
[monitor]
# with the module and format of the MPI, the format is {module}_{purpose}
# the module is Either "mem", "net", "cpu", "storage"
# the purpose is "topo"
module = mem_topo, cpu_topo

#################################### system ###############################
# you can add arbitrary key-value here, just like key = value
# you can use the key in the profile
[system]
# the disk to be analysis
disk = vda

# the network to be analysis
network = eth0

user = root

In the configuration above I only set disk = vda and network = eth0, because:

[root@openeuler ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.88  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::f816:3eff:feef:3eb  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:ef:03:eb  txqueuelen 1000  (Ethernet)
        RX packets 635  bytes 77676 (75.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 648  bytes 65560 (64.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@openeuler ~]# fdisk -l | grep dev
Disk /dev/vda: 40 GiB, 42949672960 bytes, 83886080 sectors
/dev/vda1     2048  2099199  2097152   1G EFI System
/dev/vda2  2099200 83884031 81784832  39G Linux filesystem
Disk /dev/vdb: 10 GiB, 10737418240 bytes, 20971520 sectors

Then I rebooted the ECS instance:

reboot

After the reboot I started atuned:

[root@openeuler ~]# systemctl start atuned
Job for atuned.service failed because the control process exited with error code.
See "systemctl status atuned.service" and "journalctl -xe" for details.
[root@openeuler ~]#

The start failed.

[root@openeuler ~]# systemctl status atuned.service
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2020-07-30 09:12:03 CST; 1min 25s ago
  Process: 2710 ExecStart=/usr/bin/atuned (code=exited, status=1/FAILURE)
 Main PID: 2710 (code=exited, status=1/FAILURE)

Jul 30 09:12:02 openeuler systemd[1]: Starting A-Tune Daemon...
Jul 30 09:12:03 openeuler systemd[1]: atuned.service: Main process exited, code=exited, status=1/FAILURE
Jul 30 09:12:03 openeuler systemd[1]: atuned.service: Failed with result 'exit-code'.
Jul 30 09:12:03 openeuler systemd[1]: Failed to start A-Tune Daemon.


[root@openeuler ~]# journalctl -xe
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The unit sysstat-collect.service has successfully entered the 'dead' state.
Jul 30 09:10:38 openeuler systemd[1]: Started system activity accounting tool.
-- Subject: A start job for unit sysstat-collect.service has finished successfully
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit sysstat-collect.service has finished successfully.
--
-- The job identifier is 880.
Jul 30 09:11:45 openeuler [2630]: [ps -ef | grep atuned] return code=[0], execute success by [root(uid=0)] from [pts/0 (119.3.119.18)]
Jul 30 09:12:02 openeuler systemd[1]: Configuration file /usr/lib/systemd/system/atuned.service is marked world-inaccessible. This has no effect as configuration data is accessible via>
Jul 30 09:12:02 openeuler systemd[1]: Starting A-Tune Daemon...
-- Subject: A start job for unit atuned.service has begun execution
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit atuned.service has begun execution.
--
-- The job identifier is 946.
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=info msg="Connecting to DB: /var/lib/atuned/atuned.db" file="sqlstore.go:47"
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=info msg="Connecting to DB: /var/lib/atuned/atuned.db" file="sqlstore.go:47"
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=info msg="initializing service: monitor" file="atuned.go:123"
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=info msg="initializing service: pyengine" file="atuned.go:123"
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=fatal msg="failed to listen: listen unix /var/run/atuned/atuned.sock: bind: no such file or directory" fi>
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=info msg="initializing service: monitor" file="atuned.go:123"
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=info msg="initializing service: pyengine" file="atuned.go:123"
Jul 30 09:12:03 openeuler atuned[2710]: time="2020-07-30T09:12:03+08:00" level=fatal msg="failed to listen: listen unix /var/run/atuned/atuned.sock: bind: no such file or directory" fi>
Jul 30 09:12:03 openeuler audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=atuned comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?>
Jul 30 09:12:03 openeuler systemd[1]: atuned.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- An ExecStart= process belonging to unit atuned.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 1.
Jul 30 09:12:03 openeuler [2728]: [systemctl start atuned] return code=[1], execute failed by [root(uid=0)] from [pts/0 (119.3.119.18)]
Jul 30 09:12:03 openeuler systemd[1]: atuned.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The unit atuned.service has entered the 'failed' state with result 'exit-code'.
Jul 30 09:12:03 openeuler systemd[1]: Failed to start A-Tune Daemon.
-- Subject: A start job for unit atuned.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit atuned.service has finished with a failure.
--
-- The job identifier is 946 and the job result is failed.
Jul 30 09:12:06 openeuler [2776]: [ps -ef | grep atuned] return code=[0], execute success by [root(uid=0)] from [pts/0 (119.3.119.18)]
Jul 30 09:13:28 openeuler systemd[1]: Configuration file /usr/lib/systemd/system/atuned.service is marked world-inaccessible. This has no effect as configuration data is accessible via>
Jul 30 09:13:28 openeuler [2823]: [systemctl status atuned.service] return code=[3], execute failed by [root(uid=0)] from [pts/0 (119.3.119.18)]
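
The fatal line in this journal, `listen unix /var/run/atuned/atuned.sock: bind: no such file or directory`, says that the parent directory of the socket path did not exist when atuned tried to bind. On systemd-based distributions `/var/run` usually points at `/run`, a tmpfs that is emptied on every boot, so one plausible reading (an assumption, not confirmed in this thread) is that the directory existed after `make install` and vanished with the reboot. The following minimal Python sketch reproduces the error in a temporary directory; `bind_unix_socket` is a hypothetical helper, not A-Tune code.

```python
import os
import socket
import tempfile


def bind_unix_socket(path, create_parent=False):
    """Bind a Unix domain socket at path, optionally creating its directory."""
    if create_parent:
        os.makedirs(os.path.dirname(path), mode=0o750, exist_ok=True)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as run_dir:
        # Mirrors the layout /var/run/atuned/atuned.sock inside a scratch dir.
        sock_path = os.path.join(run_dir, "atuned", "atuned.sock")
        try:
            bind_unix_socket(sock_path)
        except FileNotFoundError as err:
            print("bind failed:", err)
        sock = bind_unix_socket(sock_path, create_parent=True)
        print("bound to", sock.getsockname())
        sock.close()
```

If that reading is right, recreating the directory before starting the service (for example with `mkdir -p /var/run/atuned`) would be a quick way to confirm it.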

A-Tune version:

[root@openeuler ~]# atuned --version
atuned version 0.1(5e8b8ec)

Could you help me find out what is wrong? Thanks!

@woodrabbit, it is the master branch.

@谢志鹏 Hello, and thank you very much for your guidance! I installed from

git clone -b master https://gitee.com/openeuler/A-Tune.git

and changed a few options in the configuration file /etc/atuned/atuned.cnf:

disk = sda
network = enp189s0f0

The following commands then ran successfully:

atune-adm list
atune-adm analysis

However, after I rebooted the operating system,

reboot

...

[root@openeuler A-Tune]# ps -ef | grep atune
root        3350       1  0 11:52 ?        00:00:02 /usr/bin/python3 /usr/libexec/atuned/analysis/app-engine.py /etc/atuned/atuned.cnf
root       14414   13782  0 15:53 pts/1    00:00:00 grep --color=auto atune

But when I check the status:

[root@openeuler A-Tune]# systemctl status atuned
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2020-07-30 15:54:37 CST; 10s ago
  Process: 14946 ExecStart=/usr/bin/atuned (code=exited, status=1/FAILURE)
 Main PID: 14946 (code=exited, status=1/FAILURE)

Jul 30 15:54:37 openeuler systemd[1]: Starting A-Tune Daemon...
Jul 30 15:54:37 openeuler systemd[1]: atuned.service: Main process exited, code=exited, status=1/FAILURE
Jul 30 15:54:37 openeuler systemd[1]: atuned.service: Failed with result 'exit-code'.
Jul 30 15:54:37 openeuler systemd[1]: Failed to start A-Tune Daemon.

Then I tried to restart atuned:

[root@openeuler A-Tune]# systemctl restart atuned
Job for atuned.service failed because the control process exited with error code.
See "systemctl status atuned.service" and "journalctl -xe" for details.

Judging from the output above, has atuned.service still been disabled by the operating system settings of this ECS instance?
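
On the word `disabled`: in `systemctl status` output, the second field of the `Loaded:` line is the unit's enablement state, meaning whether it is started automatically at boot, while the `Active:` line reports the current runtime state. A unit can be `disabled` and still start when asked, as the earlier `active (running)` output in this thread shows. The following minimal Python sketch separates the two fields; `parse_status` is a hypothetical helper and the sample text is copied from the status output above.

```python
import re

STATUS_TEXT = """\
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2020-07-30 15:54:37 CST; 10s ago
"""


def parse_status(text):
    """Extract enablement and runtime state from `systemctl status` text."""
    # Loaded: <load state> (<unit file path>; <enablement>; ...)
    loaded = re.search(r"Loaded: \S+ \(([^;]+); ([^;)]+)", text)
    active = re.search(r"Active: (\S+)", text)
    return {
        "unit_file": loaded.group(1) if loaded else None,
        "enablement": loaded.group(2).strip() if loaded else None,
        "active": active.group(1) if active else None,
    }


if __name__ == "__main__":
    print(parse_status(STATUS_TEXT))
```

Here `enablement` is `disabled` and `active` is `failed`. The failure itself comes from the process exiting with status 1, so the reason has to be read from the atuned lines in the journal (for example with `journalctl -u atuned`), not from the enablement state.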

Some additional information:

[root@openeuler A-Tune]# journalctl -xe
Jul 30 16:00:09 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:00:19 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:00:29 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:00:39 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:00:41 openeuler [15539]: [atuned-adme --version] return code=[127], execute failed by [root(uid=0)] from [pts/1 (119.3.119.18)]
Jul 30 16:00:48 openeuler [15556]: [atuned --version] return code=[0], execute success by [root(uid=0)] from [pts/1 (119.3.119.18)]
Jul 30 16:00:49 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:00:59 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:01:01 openeuler CROND[15570]: (root) CMD (run-parts /etc/cron.hourly)
Jul 30 16:01:01 openeuler systemd[1]: Starting dnf makecache...
-- Subject: A start job for unit dnf-makecache.service has begun execution
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit dnf-makecache.service has begun execution.
--
-- The job identifier is 3544.
Jul 30 16:01:01 openeuler run-parts[15574]: (/etc/cron.hourly) starting 0anacron
Jul 30 16:01:01 openeuler run-parts[15580]: (/etc/cron.hourly) finished 0anacron
Jul 30 16:01:01 openeuler dnf[15571]: Metadata cache refreshed recently.
Jul 30 16:01:01 openeuler systemd[1]: dnf-makecache.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The unit dnf-makecache.service has successfully entered the 'dead' state.
Jul 30 16:01:01 openeuler systemd[1]: Started dnf makecache.
-- Subject: A start job for unit dnf-makecache.service has finished successfully
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- A start job for unit dnf-makecache.service has finished successfully.
--
-- The job identifier is 3544.
Jul 30 16:01:01 openeuler audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter>
Jul 30 16:01:01 openeuler audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? term>
Jul 30 16:01:05 openeuler [15595]: [uname -a] return code=[0], execute success by [root(uid=0)] from [pts/1 (119.3.119.18)]
Jul 30 16:01:09 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:01:19 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:01:29 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:01:39 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:01:49 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:01:59 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:02:09 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:02:19 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:02:29 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:02:39 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:02:49 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:02:59 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:03:09 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:03:19 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:03:29 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:03:39 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:03:49 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:03:59 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:04:09 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0
Jul 30 16:04:19 openeuler /usr/sbin/irqbalance[679]: IRQ virtio0 (40) guessed as class 0

A-Tune configuration file:

[root@openeuler A-Tune]# cat /etc/atuned/atuned.cnf
# Copyright (c) 2019 Huawei Technologies Co., Ltd.
# A-Tune is licensed under the Mulan PSL v2.
# You can use this software according to the terms and conditions of the Mulan PSL v2.
# You may obtain a copy of Mulan PSL v2 at:
#     http://license.coscl.org.cn/MulanPSL2
# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR
# PURPOSE.
# See the Mulan PSL v2 for more details.
# Create: 2019-10-29

#################################### server ###############################
# atuned config
[server]
# the protocol grpc server running on
# ranges: unix or tcp
protocol = unix

# the address that the grpc server to bind to
# default is unix socket /var/run/atuned/atuned.sock
# ranges: /var/run/atuned/atuned.sock or ip address
address = /var/run/atuned/atuned.sock

# the atune nodes in cluster mode, separated by commas
# it is valid when protocol is tcp
# connect = ip01,ip02,ip03

# the atuned grpc listening port
# the port can be set between 0 to 65535 which not be used
# port = 60001

# the rest service listening port, default is 8383
# the port can be set between 0 to 65535 which not be used
rest_host = localhost
rest_port = 8383

# the tuning optimizer host and port, start by engine.service
# if engine_host is same as rest_host, two ports cannot be same
# the port can be set between 0 to 65535 which not be used
engine_host = localhost
engine_port = 3838

# when run analysis command, the numbers of collected data.
# default is 20
sample_num = 20

# interval for collecting data, default is 5s
interval = 5

# enable gRPC and http server authentication SSL/TLS
# default is false
# tls = true
# tlsservercertfile = /etc/atuned/server.pem
# tlsserverkeyfile = /etc/atuned/server.key
# tlshttpcertfile = /etc/atuned/http/server.pem
# tlshttpkeyfile = /etc/atuned/http/server.key
# tlshttpcacertfile = /etc/atuned/http/cacert.pem

#################################### log ###############################
[log]
# either "debug", "info", "warn", "error", "critical", default is "info"
level = info

#################################### monitor ###############################
[monitor]
# with the module and format of the MPI, the format is {module}_{purpose}
# the module is Either "mem", "net", "cpu", "storage"
# the purpose is "topo"
module = mem_topo, cpu_topo

#################################### system ###############################
# you can add arbitrary key-value here, just like key = value
# you can use the key in the profile
[system]
# the disk to be analysis
disk = vda

# the network to be analysis
network = eth0

user = root

My atuned version:

[root@openeuler A-Tune]# atuned --version
atuned version 0.2(09a4672)

System information:

[root@openeuler A-Tune]# uname -a
Linux openeuler 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

Thanks!

@woodrabbit Hi, please set level to debug in the [log] section of /etc/atuned/atuned.cnf, run systemctl restart atuned, and then use cat /var/log/messages | grep atune to check whether there is a specific error message.
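
For anyone following along, the steps above could be scripted roughly as below. This is only a sketch: `set_log_level` is a hypothetical helper name, and it assumes GNU sed (for `-i`) and a config file laid out like the one posted in this issue, with a `level = ...` line inside the `[log]` section.

```shell
#!/bin/sh
# Hypothetical helper: set "level" inside the [log] section of an
# atuned.cnf-style file. Usage: set_log_level <config-file> <level>
set_log_level() {
    conf="$1"
    level="$2"
    # Limit the substitution to the lines from "[log]" up to the next
    # section header, so "level" keys in other sections are untouched.
    sed -i "/^\[log\]/,/^\[/ s/^level *=.*/level = ${level}/" "$conf"
}

# Intended use on the affected host (run as root):
#   set_log_level /etc/atuned/atuned.cnf debug
#   systemctl restart atuned
#   grep atune /var/log/messages
```

Remember to set the level back to info afterwards, since debug logging is verbose.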

@hanxinke OK, will do!

failed to listen: listen unix /var/run/atuned/atuned.sock: bind: no such file or directory
Try creating the /var/run/atuned directory.
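
A quick way to check this without hard-coding the path is to read the socket address from the config and create its parent directory only when it is missing. The sketch below assumes the `[server]` section has an `address = ...` line as in the config posted above; `socket_address` and `ensure_socket_dir` are hypothetical helper names, not part of A-Tune.

```shell
#!/bin/sh
# Hypothetical helper: print the value of "address" from the [server] section.
socket_address() {
    sed -n '/^\[server\]/,/^\[/ s/^address *= *//p' "$1"
}

# Hypothetical helper: create the socket's parent directory if it is missing.
ensure_socket_dir() {
    addr=$(socket_address "$1")
    case "$addr" in
        /*) ;;            # absolute path: treat it as a unix socket
        *) return 0 ;;    # ip address or empty: nothing to create
    esac
    dir=$(dirname "$addr")
    if [ ! -d "$dir" ]; then
        echo "creating missing directory: $dir"
        mkdir -p "$dir"
    fi
}

# Intended use on the affected host (run as root):
#   ensure_socket_dir /etc/atuned/atuned.cnf
#   systemctl restart atuned
```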

@hanxinke OK!

@hanxinke It works now!!!

[root@openeuler ~]# systemctl restart atuned
[root@openeuler ~]# systemctl status atuned
● atuned.service - A-Tune Daemon
   Loaded: loaded (/usr/lib/systemd/system/atuned.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-07-30 17:32:55 CST; 17s ago
 Main PID: 3509 (atuned)
    Tasks: 13
   Memory: 173.8M
   CGroup: /system.slice/atuned.service
           ├─3509 /usr/bin/atuned
           └─3515 python3 /usr/libexec/atuned/analysis/app.py /etc/atuned/atuned.cnf

Jul 30 17:32:55 openeuler atuned[3509]: time="2020-07-30T17:32:55+08:00" level=info msg="pyservice has been started" file="atuned.go:195"
Jul 30 17:32:55 openeuler systemd[1]: Started A-Tune Daemon.
Jul 30 17:32:56 openeuler atuned[3515]: 2020-07-30 17:32:56,895 [INFO] flask.app[line:38] : {'module': 'MEM', 'purpose': 'TOPO', 'fmt': 'xml', 'path': '/usr/share/atuned/checker/mem_to>
Jul 30 17:32:58 openeuler atuned[3515]: 2020-07-30 17:32:58,399 [INFO] flask.app[line:38] : {'module': 'CPU', 'purpose': 'TOPO', 'fmt': 'xml', 'path': '/usr/share/atuned/checker/cpu_to>
Jul 30 17:32:58 openeuler atuned[3509]: Failure to load_config_data
Jul 30 17:32:58 openeuler atuned[3509]: time="2020-07-30T17:32:58+08:00" level=info msg="begin to restore profile id: 0" file="profile_rollback.go:96"
Jul 30 17:32:58 openeuler atuned[3509]: time="2020-07-30T17:32:58+08:00" level=info msg="begin to restore profile id: 0" file="profile_rollback.go:96"
Jul 30 17:32:58 openeuler atuned[3515]: 2020-07-30 17:32:58,483 [INFO] flask.app[line:35] : {'section': 'kernel_config', 'config': ''}
Jul 30 17:32:58 openeuler atuned[3515]: 2020-07-30 17:32:58,487 [INFO] flask.app[line:35] : {'section': 'bootloader.grub2', 'config': 'CPI_ROLLBACK_INFO = /usr/share/atuned/backup/defa>
Jul 30 17:32:58 openeuler atuned[3515]: 2020-07-30 17:32:58,492 [INFO] flask.app[line:35] : {'section': 'bootloader.grub2', 'config': 'CPI_ROLLBACK_INFO = /usr/share/atuned/backup/defa>

@hanxinke This is all I did:

[root@openeuler ~]# mkdir -p /var/run/atuned
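
One caveat: on most systemd-based systems /var/run is a symlink to /run, which is a tmpfs, so a directory created by hand is likely to disappear after a reboot. A possible way to make it persistent is a systemd drop-in using the standard `RuntimeDirectory=` directive, sketched below. This is an assumption about a workable fix, not necessarily what the A-Tune maintainers changed; `write_runtime_dir_dropin` and the file name `runtime-dir.conf` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that asks systemd to create
# /run/atuned each time the service starts.
# The first argument is the drop-in directory.
write_runtime_dir_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/runtime-dir.conf" <<'EOF'
[Service]
RuntimeDirectory=atuned
EOF
}

# Intended use on the affected host (run as root):
#   write_runtime_dir_dropin /etc/systemd/system/atuned.service.d
#   systemctl daemon-reload
#   systemctl restart atuned
```

With `RuntimeDirectory=`, systemd creates the directory before starting the service and removes it again when the service stops, so no manual mkdir is needed.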

@hanxinke The debug output I uploaded earlier was very long and made the page load slowly, so I have deleted it. Thank you very much! :+1: :smile:

@hanxinke, I think we need to fix this.

谢志鹏 set the assignee to hanxinke
hanxinke changed the task status from To Do to Done via openeuler/A-Tune Pull Request !118
