nzbget 0.4.0 oom killed during par2 repair

forest

nzbget 0.4.0 oom killed during par2 repair

Post by forest » 14 Apr 2008, 16:12

I just upgraded to nzbget 0.4.0 on my linkstation, which is an arm9-based NAS. The machine has 128M of memory (about 20M free) and about 130M of swap (almost all free).

Downloading and decoding work fine, but during the par2 repair phase, nzbget suddenly dies with a "Killed" message. I believe Linux's OOM (out of memory) killer is doing this.

I already have WriteBufferSize=1048576. How can I make nzbget use less memory during par2 repair?

hugbug

RE: nzbget 0.4.0 oom killed during par2 repair

Post by hugbug » 14 Apr 2008, 17:17

20M is more than enough.

The program was most likely killed because of a segmentation fault (a program error).

Does the problem occur on every par-check (i.e. is the problem reproducible)?

Did you compile the program yourself, or do you use the optware repository (if so, which optware target is it)?

forest

RE: nzbget 0.4.0 oom killed during par2 repair

Post by forest » 17 Apr 2008, 15:50

So far, this version has been killed on every par check.

I compiled it myself.

If it were a segmentation fault, I'm not sure why it would be dying with the message "Killed" (which I have read is an OOM indicator on Linux) instead of the usual "Segmentation fault". I don't know for certain, though.

forest

RE: nzbget 0.4.0 oom killed during par2 repair

Post by forest » 17 Apr 2008, 16:14

Just to prove to myself what a segfault looks like on this machine, I wrote a program to force one:

$ cat >segfault.c
int main()
{
    int *foo = 0;
    *foo = 1;
    return 0;
}
$ gcc -o segfault segfault.c
$ ./segfault
Segmentation fault

forest

RE: nzbget 0.4.0 oom killed during par2 repair

Post by forest » 17 Apr 2008, 16:20

And here's a test program designed to allocate and write to memory:

$ cat > oom.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MEGABYTE (1024*1024)

int main(int argc, char *argv[])
{
    void *myblock = NULL;
    int count = 0;

    while (1)
    {
        myblock = (void *) malloc(MEGABYTE);
        if (!myblock) break;
        memset(myblock, 1, MEGABYTE);
        printf("Currently allocating %d MB\n", ++count);
    }
    exit(0);
}
$ gcc -o oom oom.c
$ ./oom
Currently allocating 1 MB
Currently allocating 2 MB
Currently allocating 3 MB
[...]
Currently allocating 212 MB
Currently allocating 213 MB
Killed


That's exactly the message I see after nzbget dies during the par2 phase.

hugbug

RE: nzbget 0.4.0 oom killed during par2 repair

Post by hugbug » 17 Apr 2008, 17:34

1. I use nzbget on an Asus router with only 32 MB of memory. Swap usage never exceeds 15 MB.

2. There is no way to adjust memory usage during parcheck.

3. Can you please test with a small nzb-file and monitor the memory usage during the par-check?

4. What system do you use, is it optware or debian? If optware, what target (platform)?

5. Can you please try to compile par2 (par2cmdline) and test it? (libpar2 is based on par2cmdline.)

6. The compiler option "-fno-stack-protector" might help.

forest

RE: nzbget 0.4.0 oom killed during par2 repair

Post by forest » 18 Apr 2008, 06:09

1. That's comforting. We should be able to make this work.

2. Okay.

3. I tested with a small nzb-file, about 5 megs of files. It was killed again. I left the files in nzbget's queue. When I restart it, nzbget loads and begins verifying the par2 files, and is killed again in 4.5 seconds. How do you suggest I collect memory usage for a process that ends so quickly?

4. I use debian etch on my linkstation. The FreeLink_arm9-1.0rev2 distribution. Kernel version 2.6.12.6-arm1.

5. I have been using the par2 command line program from the debian par2 package, and it works fine. I just downloaded par2cmdline from parchive.sourceforge.net, but the build died with errors like this:
reedsolomon.cpp:54: error: explicit specialization of

forest

RE: nzbget 0.4.0 oom killed during par2 repair

Post by forest » 18 Apr 2008, 06:24

Okay, this is interesting...

I didn't know nzbget was creating child processes, but apparently it is, using the clone() system call. (Could these be threads implemented as lightweight processes?) I ran it with strace -f, which revealed a SIGSEGV in one of the child processes. (Maybe that's why it didn't look like a segfault before.)

<...>
[pid 16819] clone(Process 16824 attached
child_stack=0xbdfffbd8, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 16824
<...>
[pid 16824] write(6, "Thu Apr 17 23:13:43 2008\tINFO\tRe"..., 67) = 67
[pid 16824] close(6) = 0
[pid 16824] munmap(0x40606000, 4096) = 0
[pid 16824] time(NULL) = 1208499223
[pid 16824] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 16824 detached
[pid 16820] <... accept resumed> 0xbe7ffa64, [16]) = ? ERESTARTSYS (To be restarted)
[pid 16819] <... poll resumed> [{fd=3, events=POLLIN}], 1, 2000) = -1 EINTR (Interrupted system call)
[pid 16821] <... nanosleep resumed> 0) = ? ERESTART_RESTARTBLOCK (To be restarted)
[pid 16818] <... nanosleep resumed> 0) = ? ERESTART_RESTARTBLOCK (To be restarted)
[pid 16822] <... nanosleep resumed> 0) = ? ERESTART_RESTARTBLOCK (To be restarted)
[pid 16823] <... nanosleep resumed> 0) = ? ERESTART_RESTARTBLOCK (To be restarted)
[pid 16820] +++ killed by SIGKILL +++
Process 16820 detached
[pid 16819] +++ killed by SIGKILL +++
Process 16819 detached
[pid 16821] +++ killed by SIGKILL +++
Process 16821 detached
[pid 16818] +++ killed by SIGKILL +++
Process 16818 detached
[pid 16822] +++ killed by SIGKILL +++
Process 16822 detached
+++ killed by SIGKILL +++
Process 16823 detached

hugbug

RE: nzbget 0.4.0 oom killed during par2 repair

Post by hugbug » 18 Apr 2008, 06:50

1. nzbget uses pthreads (Linux threads). They may look like child processes; that's OK.

2. Please compile the program with debug enabled (./configure "--enable-debug"), then activate all log targets in the config file (debugtarget=both, detailtarget=both, etc.). It should then write a lot of debug info to the log file. Please send the file to me.

3. >I can try -fno-stack-protector once I get to the point where it will
>build at all. Any suggestions?

I mean, try the option with nzbget, not par2. It was mentioned on this forum that it helped on an nslu2 (also ARM) running debian.

4. Are you able to compile libpar2?

nobody

RE: nzbget 0.4.0 oom killed during par2 repair

Post by nobody » 19 Apr 2008, 00:54

Re 5: In reedsolomon.cpp, replace the six instances of "bool ReedSolomon" with "template <> bool ReedSolomon".
