Backup fails on cron job (also apt-get update). [#38764]

Submitted by LoopZilla on Sat, 11/28/2015 - 08:15

I have a backup script that runs daily. Each day is a different script.

On Sunday, I run /root/scripts/backup0.sh which is:

# /usr/sbin/virtualmin backup-domain --dest /root/backups0/ --all-domains --all-features --newformat --all-virtualmin

On Monday, /root/backups1/ and so on.

Here is the crontab entry:

1 1 * * 0 /root/scripts/backup0.sh 1 1 * * 1 /root/scripts/backup1.sh 1 1 * * 2 /root/scripts/backup2.sh 1 1 * * 3 /root/scripts/backup3.sh 1 1 * * 4 /root/scripts/backup4.sh 1 1 * * 5 /root/scripts/backup5.sh 1 1 * * 6 /root/scripts/backup6.sh

This used to work well on Debian 7, but very poorly on Debian 8. I have to run the scripts manually.

The backups tend to fail to errors every night, often the gzip of databases.

Where should I look for errors? Are the errors logged somewhere?

I also get some strange errors with the apt-get updates, again not in Debian 7 (that is from the daily cron job). Could this be related?

( apt-get -qq update && apt-get -qqy upgrade -u ) 2>&1 | tee -a /var/log/auto-upgrade.log

debconf: unable to initialize frontend: Dialog debconf: (TERM is not set, so the dialog frontend is not usable.) debconf: falling back to frontend: Readline debconf: unable to initialize frontend: Readline debconf: (This frontend requires a controlling tty.) debconf: falling back to frontend: Teletype dpkg-preconfigure: unable to re-open stdin: dpkg: warning: 'ldconfig' not found in PATH or not executable dpkg: warning: 'start-stop-daemon' not found in PATH or not executable dpkg: error: 2 expected programs not found in PATH or not executable Note: root's PATH should usually contain /usr/local/sbin, /usr/sbin and /sbin E: Sub-process /usr/bin/dpkg returned an error code (2)

I rebuilt from a clean Debian 8, adding Virtulamin and then the domains.

I switched from mysql to MariaDB.

Status:

Closed (fixed)

Comments

Submitted by andreychek on Sat, 11/28/2015 - 09:53 Comment #1

Howdy -- it looks like you're seeing two different issues there.

Regarding the backups, do you receive error output via email when the backups run?

You could also see the same backup output by going into Backup and Restore -> Backup Logs.

That should show you what error is occurring when the backups are running.

Submitted by LoopZilla on Sat, 11/28/2015 - 16:26 Comment #2

Regarding the backups, do you receive error output via email when the backups run?

Yes, I do. It just stops before the end.

You could also see the same backup output by going into Backup and Restore -> Backup Logs.

A mixed picture! For example, backups6 (today) is one fail (crontab) and one success (run from the command line.

Showing logs from last 7 days .. Destination Backup contents Made by Started Run time Final size OK?
local file /root/backups6 29 domains root 28/Nov/2015 13:14 09:04 2.06 GB Yes local file /root/backups6 29 domains root 28/Nov/2015 01:01 00:08 0 bytes No local file /root/backups5 29 domains root 27/Nov/2015 10:22 09:00 2.05 GB Yes local file /root/backups4 29 domains root 26/Nov/2015 09:55 08:59 2.08 GB Yes local file /root/backups4 29 domains root 26/Nov/2015 09:48 01:40 0 bytes No local file /root/backups4 29 domains root 26/Nov/2015 09:42 01:37 0 bytes No local file /root/backups4 29 domains root 26/Nov/2015 01:01 01:51 0 bytes No local file /root/backups3 28 domains root 25/Nov/2015 09:52 09:01 2.07 GB Yes local file /root/backups3 28 domains root 25/Nov/2015 01:01 00:13 0 bytes No local file /root/backups2 28 domains root 24/Nov/2015 10:22 09:28 2.07 GB Yes local file /root/backups2 28 domains root 24/Nov/2015 01:01 01:51 0 bytes No local file /root/backups1 28 domains root 23/Nov/2015 12:11 08:56 2.07 GB Yes local file /root/backups1 28 domains root 23/Nov/2015 10:35 09:14 2.07 GB Yes local file /root/backups0 28 domains root 22/Nov/2015 11:17 08:31 2.07 GB Yes local file /root/backups0 28 domains root 22/Nov/2015 10:29 08:29 2.07 GB Yes

The web interface gives me no more information that the email reports.

Submitted by LoopZilla on Sat, 11/28/2015 - 16:28 Comment #3

Showing logs from last 7 days ..

Destination Backup contents Made by Started Run time Final size OK?

local file /root/backups6 29 domains root 28/Nov/2015 13:14 09:04 2.06 GB Yes

local file /root/backups6 29 domains root 28/Nov/2015 01:01 00:08 0 bytes No

local file /root/backups5 29 domains root 27/Nov/2015 10:22 09:00 2.05 GB Yes

local file /root/backups4 29 domains root 26/Nov/2015 09:55 08:59 2.08 GB Yes

local file /root/backups4 29 domains root 26/Nov/2015 09:48 01:40 0 bytes No

local file /root/backups4 29 domains root 26/Nov/2015 09:42 01:37 0 bytes No

local file /root/backups4 29 domains root 26/Nov/2015 01:01 01:51 0 bytes No

local file /root/backups3 28 domains root 25/Nov/2015 09:52 09:01 2.07 GB Yes

local file /root/backups3 28 domains root 25/Nov/2015 01:01 00:13 0 bytes No

local file /root/backups2 28 domains root 24/Nov/2015 10:22 09:28 2.07 GB Yes

local file /root/backups2 28 domains root 24/Nov/2015 01:01 01:51 0 bytes No

local file /root/backups1 28 domains root 23/Nov/2015 12:11 08:56 2.07 GB Yes

local file /root/backups1 28 domains root 23/Nov/2015 10:35 09:14 2.07 GB Yes

local file /root/backups0 28 domains root 22/Nov/2015 11:17 08:31 2.07 GB Yes

local file /root/backups0 28 domains root 22/Nov/2015 10:29 08:29 2.07 GB Yes

Submitted by LoopZilla on Sat, 11/28/2015 - 16:29 Comment #4

A typical error message is:

Dumping MySQL database dummy ..
.. gzip failed!

In this case, the database was empty.

Submitted by andreychek on Sat, 11/28/2015 - 16:50 Comment #5

Is this a VPS, or dedicated server?

Also, what is the output of the commands "free -m" and "df -h"?

If you have a /proc/user_beancounters file, can you paste in it's contents?

Lastly, the domain in question -- does it have plenty of quota space available?

Submitted by LoopZilla on Sun, 11/29/2015 - 05:56 Comment #6

VPS.

free -m

total used free shared buffers cached

Mem: 1996 1980 15 200 102 871

-/+ buffers/cache: 1006 990

Swap: 0 0 0

free -m

total used free shared buffers cached

Mem: 1996 1833 163 214 228 637

-/+ buffers/cache: 967 1029

Swap: 0 0 0

Swap is switched off.

df -h

Filesystem Size Used Avail Use% Mounted on

/dev/root 89G 29G 56G 35% /

devtmpfs 998M 0 998M 0% /dev

tmpfs 999M 0 999M 0% /dev/shm

tmpfs 999M 9.5M 989M 1% /run

tmpfs 5.0M 4.0K 5.0M 1% /run/lock

tmpfs 999M 0 999M 0% /sys/fs/cgroup

tmpfs 200M 0 200M 0% /run/user/0

file /proc/user_beancounters /proc/user_beancounters: cannot open `/proc/user_beancounters' (No such file or directory)

I did look at quotas. Since I have a large diskspace (at 35% usage I have switched quotas for all domains). Also, different domains fail each time, not just the first.

Should I switch on quotas?

But on reflection the RAM does look like it could be an issue.

Total 1996 Used 1980 Available 15

Perhaps does not look good? Should I create a temp space?

But the script works in a interactive shell (foreground or background). My question is why should it fail as a cron job?

Submitted by andreychek on Sun, 11/29/2015 - 09:47 Comment #7

If quotas are off, those definitely wouldn't be the problem... you can keep those off while troubleshooting.

Enabling swap space could potentially help though.

What kind of VPS is it you're using there?

Also, what is the output of this command:

dmesg | tail -50

Submitted by LoopZilla on Sun, 11/29/2015 - 09:54 Comment #8

Enabling swap space could potentially help though.

Forbidden by my ISP.

What kind of VPS is it you're using there?

A Miniserver.

http://www.memset.com/dedicated-servers/vps/

uname -a

Linux rxxxxxxxx.miniserver.com 3.18.12-x1-64 #1 SMP Tue Apr 28 11:34:18 GMT 2015 x86_64 GNU/Linux

Also, what is the output of this command....

dmesg | tail -50

[ 2.690672] systemd[1]: Created slice User and Session Slice.

[ 2.690686] systemd[1]: Starting /dev/initctl Compatibility Named Pipe.

[ 2.690763] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.

[ 2.690778] systemd[1]: Starting Delayed Shutdown Socket.

[ 2.690836] systemd[1]: Listening on Delayed Shutdown Socket.

[ 2.690851] systemd[1]: Starting Journal Socket (/dev/log).

[ 2.690925] systemd[1]: Listening on Journal Socket (/dev/log).

[ 2.690946] systemd[1]: Starting udev Control Socket.

[ 2.691008] systemd[1]: Listening on udev Control Socket.

[ 2.691027] systemd[1]: Starting udev Kernel Socket.

[ 2.691078] systemd[1]: Listening on udev Kernel Socket.

[ 2.691093] systemd[1]: Starting Journal Socket.

[ 2.691168] systemd[1]: Listening on Journal Socket.

[ 2.691189] systemd[1]: Starting System Slice.

[ 2.691255] systemd[1]: Created slice System Slice.

[ 2.691302] systemd[1]: Starting File System Check on Root Device...

[ 2.692354] systemd[1]: Starting system-getty.slice.

[ 2.692463] systemd[1]: Created slice system-getty.slice.

[ 2.692493] systemd[1]: Starting system-serial\x2dgetty.slice.

[ 2.692575] systemd[1]: Created slice system-serial\x2dgetty.slice.

[ 2.692634] systemd[1]: Starting Increase datagram queue length...

[ 2.693639] systemd[1]: Mounting POSIX Message Queue File System...

[ 2.694645] systemd[1]: Mounted Debug File System.

[ 2.694694] systemd[1]: Starting udev Coldplug all Devices...

[ 2.891293] systemd[1]: Mounted Huge Pages File System.

[ 2.942509] systemd[1]: Started Create list of required static device nodes for the current kernel.

[ 2.942610] systemd[1]: Starting Create Static Device Nodes in /dev...

[ 2.944146] systemd[1]: Started Set Up Additional Binary Formats.

[ 2.944977] systemd[1]: Starting Load Kernel Modules...

[ 2.946017] systemd[1]: Starting Slices.

[ 2.946060] systemd[1]: Reached target Slices.

[ 2.974461] systemd[1]: Started Load Kernel Modules.

[ 2.976042] systemd[1]: Mounting Configuration File System...

[ 2.977092] systemd[1]: Starting Apply Kernel Variables...

[ 2.978299] systemd[1]: Mounting FUSE Control File System...

[ 2.984267] systemd[1]: Started Apply Kernel Variables.

[ 2.993200] systemd[1]: Started udev Coldplug all Devices.

[ 3.010261] systemd[1]: Mounted FUSE Control File System.

[ 3.010332] systemd[1]: Mounted Configuration File System.

[ 3.010380] systemd[1]: Mounted POSIX Message Queue File System.

[ 3.010709] systemd[1]: Started Increase datagram queue length.

[ 3.015622] systemd[1]: Starting Syslog Socket.

[ 3.015701] systemd[1]: Listening on Syslog Socket.

[ 3.015720] systemd[1]: Starting Journal Service...

[ 3.016908] systemd[1]: Started Journal Service.

[ 3.116815] systemd-udevd[1929]: starting version 215

[ 3.495646] EXT3-fs (xvda1): using internal journal

[ 3.740781] systemd-journald[1927]: Received request to flush runtime journal from PID 1

[ 5.584526] random: nonblocking pool is initialized

[ 156.858994] nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.

Submitted by LoopZilla on Sun, 11/29/2015 - 09:59 Comment #9

Did you have an idea why the apt-get also fails as a crontab, but works as a shell command?

Here is the email body from the cron job:

debconf: unable to initialize frontend: Dialog

debconf: (TERM is not set, so the dialog frontend is not usable.)

debconf: falling back to frontend: Readline

debconf: unable to initialize frontend: Readline

debconf: (This frontend requires a controlling tty.)

debconf: falling back to frontend: Teletype

dpkg-preconfigure: unable to re-open stdin:

dpkg: warning: 'ldconfig' not found in PATH or not executable

dpkg: warning: 'start-stop-daemon' not found in PATH or not executable

dpkg: error: 2 expected programs not found in PATH or not executable

Note: root's PATH should usually contain /usr/local/sbin, /usr/sbin and /sbin

E: Sub-process /usr/bin/dpkg returned an error code (2)

Submitted by andreychek on Sun, 11/29/2015 - 10:38 Comment #10

Those are most likely unrelated issues.

The backup issues you are seeing are probably resource related.

You may be running out of RAM, disk space, or some related kind of resource during the backup process.

The apt-get issue appears to be a server configuration issue, where for some reason some files are not in root's path. The path used in cron can be different from the path used when logged in on the command line.

That may be an issue with the image your provider setup, or also possibly due to configuration changes made after that.

You can see a few examples of how to troubleshoot and correct that by Googling this phrase (from your error above):

'ldconfig' not found in PATH or not executable

As far as your backups go though -- are there any services running that you can stop, in order to free up more RAM?

Also, what kind of backup format are you using? You can determine that by looking in the Backup Schedule for your backup. It's in the "Destination and format" section of that screen.

Submitted by andreychek on Sun, 11/29/2015 - 10:39 Comment #11

Also, in addition to the above questions -- which Virtualmin version is it that you're using there?

Submitted by LoopZilla on Mon, 11/30/2015 - 09:18 Comment #12

Currently 4.18.gpl on Debian 8.

I moved over from 4.09gpl on Debian 7 a few weeks ago, and that's when the problems started.

Submitted by LoopZilla on Mon, 11/30/2015 - 09:21 Comment #13

As far as your backups go though -- are there any services running that you can stop, in order to free up more RAM?

Yes, I could try that. I have considered increasing the RAM.

Also, what kind of backup format are you using? You can determine that by looking in the Backup Schedule for your backup. It's in the "Destination and format" section of that screen.

I am using a shell script. I have also try using the service in the control panel.

My shell script is

/usr/sbin/virtualmin backup-domain --dest /root/backups0/ --all-domains --all-features --newformat --all-virtualmin

Submitted by andreychek on Mon, 11/30/2015 - 09:49 Comment #14

I absolutely believe you that it happened at the same time... however, it's unlikely a Virtualmin-specific issue. It could be that the package upgrades at the time performed a number of updates that increased RAM usage, or change some other software behavior.

We have seen that particular error a handful of times over the last few years, and it's generally been due to resources running out during the backup process.

However, the particular error that's seen there isn't very helpful :-/

I'm going to talk to Jamie to see if there's a way to get more information on why exactly the gzip is failing, and I'll get back to you on that.

Submitted by LoopZilla on Tue, 12/01/2015 - 03:42 Comment #15

Thanks. I agree that it might be outside Virtualmin. But I have some errors from "web-lib-funcs.pl"

Here is another error from today's backup.

Creating backup for virtual server xxx.com .. Copying virtual server configuration .. Failed to open /home/xxx/.backup/xxx.com_virtualmin_scripts for writing : No such file or directory at ../web-lib-funcs.pl line 1397.

Also, when the system run /etc/webmin/status/monitor.pl I sometimes get this:

postfix::is_postfix_running failed : Failed to query Postfix config command to get the current value of parameter process_id_directory: at ../web-lib-funcs.pl line 1397.

This is also a cronjob, set up by the Virtualmin installation:-

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /etc/webmin/status/monitor.pl

find /usr/ -name web-lib-funcs.pl -print

/usr/share/webmin/web-lib-funcs.pl /usr/share/usermin/web-lib-funcs.pl

I think the code around line 1397 is this:

1388 &load_theme_library();

1389 if ($main::error_must_die) {

1390 if ($gconfig{'error_stack'}) {

1391 print STDERR "Error: ",$msg,"\n";

1392 for(my $i=0; my @stack = caller($i); $i++) {

1393 print STDERR "File: $stack[1] Line: $stack[2] ",

1394 "Function: $stack[3]\n";

1395 }

1396 }

1397 die @_;

1398 }

1399 elsif (!$ENV{'REQUEST_METHOD'}) {

Thanks for your help.

Submitted by LoopZilla on Tue, 12/01/2015 - 03:48 Comment #16

I am going to change the time of the cron job. If there is another job running, then this might lead to a drain on resources (RAM memory). I have just run the cron job again (local time around 9am) and it worked.

Submitted by andreychek on Tue, 12/01/2015 - 09:29 Comment #17

In the case of the backup error you received, does the "/home/xxx/" directory exist? That should be the user's home directory.

Also, I think you mentioned that quotas were disabled, but I just wanted to double-check that... if quotas were enabled for this domain, and it's quota was reached, that could explain the issue you're seeing.

Submitted by LoopZilla on Wed, 12/02/2015 - 04:02 Comment #18

In the case of the backup error you received, does the "/home/xxx/" directory exist? That should be the user's home directory.

Correct. It exists.

Also, I think you mentioned that quotas were disabled, but I just wanted to double-check that... if quotas were enabled for this domain, and it's quota was reached, that could explain the issue you're seeing.

OK. I will check that.

LZ.

Submitted by LoopZilla on Sat, 12/12/2015 - 03:22 Comment #19

Problem solved. I believe it was a load or memory issue, so I shut Apache down, carry out the backup and start Apache up again.

Submitted by LoopZilla on Sat, 12/12/2015 - 03:22 Comment #20

Submitted by andreychek on Sat, 12/12/2015 - 08:44 Comment #21

Great, I'm glad you came up with a way to get the backups running smoothly!