Topic: Kernel 2.6.30-rc7 (generic *.deb packaging)

Today's effort (-09144, yyddd) is up at http://hp-umpc.com/ce1200v as:
NOTE: You have to type or cut&paste that link - it doesn't "click" - that is intentional.

2.6.30-rc7-ce1200v-09144lk.md5
2.6.30-rc7-ce1200v-09144.md5
config-2.6.30-rc7-ce1200v-09144
config-2.6.30-rc7-ce1200v-09144lk
linux-2.6.30-rc7-ce1200v-09144_2.6.30-rc7-ce1200v-09144-25_i386.deb
linux-2.6.30-rc7-ce1200v-09144lk_2.6.30-rc7-ce1200v-09144lk-26_i386.deb
linux-firmware-image_2.6.30-rc7-ce1200v-09144-25_all.deb
linux-firmware-image_2.6.30-rc7-ce1200v-09144lk-26_all.deb
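
For anyone puzzled by the yyddd tag in those names: it is a two-digit year followed by the day of the year, so 09144 is day 144 of 2009 (May 24). A minimal sketch of how such a tag can be generated in C follows. The function name build_tag is made up for illustration; it is not part of any build script mentioned in this thread.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Format a calendar date as a yyddd tag (two-digit year + day of year).
 * build_tag is a hypothetical helper name, used here only for illustration.
 * Returns 0 on success, -1 on failure.
 */
int build_tag(int year, int month, int day, char *buf, size_t len)
{
    struct tm tm;

    memset(&tm, 0, sizeof tm);
    tm.tm_year  = year - 1900;   /* struct tm counts years from 1900 */
    tm.tm_mon   = month - 1;     /* struct tm counts months from 0 */
    tm.tm_mday  = day;
    tm.tm_hour  = 12;            /* midday, well away from DST switches */
    tm.tm_isdst = -1;            /* let the C library decide about DST */

    /* mktime() normalizes the structure and fills in tm_yday for %j. */
    if (mktime(&tm) == (time_t)-1)
        return -1;

    /* %y = two-digit year, %j = three-digit day of the year. */
    if (strftime(buf, len, "%y%j", &tm) == 0)
        return -1;

    return 0;
}
```

Calling build_tag(2009, 5, 24, buf, sizeof buf) with an 8-byte buffer leaves "09144" in buf.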

This is the identical configuration tested yesterday, only built against the 2.6.30-rc7 tagged code base.
Yesterday's testing of the -09143{,lk} pair is described here:
http://forum.netbookuser.com/viewtopic. … 6968#p6968

For anyone losing track or joining the party late. . .

My /etc/initramfs-tools/modules looks like:

# List of modules that you want to include in your initramfs.
#
# Syntax:  module_name [args ...]
#
# You must run update-initramfs(8) to effect this change.
#
ehci_hcd
uhci_hcd
rtl8187

That is required so the USB drivers get loaded in the correct order, before
the distribution scripting gets a chance to do it wrong.  wink

My /etc/modules file looks like:

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
#
hangcheck-timer hangcheck_reboot=1 hangcheck_dump_tasks=1 hangcheck_tick=20 hangcheck_margin=10

You can also add: padlock-aes and padlock-sha (they seem to be safe but are modules anyway).

My /boot/grub/menu.lst looks like this (in relevant part):

#
# Put static boot stanzas before and/or after AUTOMAGIC KERNEL LIST

title           Linus-2.6.30 - kernel 2.6.30-rc7-ce1200v-09144lk
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.30-rc7-ce1200v-09144lk root=/dev/sda1 resume=/dev/sda2 ro quiet splash
initrd          /boot/initrd.img-2.6.30-rc7-ce1200v-09144lk
quiet

title           Linux-2.6.30 - kernel 2.6.30-rc7-ce1200v-09144
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.30-rc7-ce1200v-09144 root=/dev/sda1 resume=/dev/sda2 ro quiet splash
initrd          /boot/initrd.img-2.6.30-rc7-ce1200v-09144
quiet

title           Linux-2.6.30 - kernel 2.6.30-rc6-ce1200v-09143lk
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.30-rc6-ce1200v-09143lk root=/dev/sda1 resume=/dev/sda2 ro quiet splash
initrd          /boot/initrd.img-2.6.30-rc6-ce1200v-09143lk
quiet

### BEGIN AUTOMAGIC KERNELS LIST

Yesterday's -09143lk build has been promoted to my 'rescue' kernel - a known 'best to date' point.

Still using the Kbuild deb-pkg target and the packaging notes here:
http://forum.netbookuser.com/viewtopic. … 6841#p6841

= = = =

Classic Rock today -
The machine has been up 45 minutes (-09144lk) while getting these posts organized.
That beats the -09143 uptime of yesterday.

Why music?
The wired NIC is on the other side of the PCI-to-PCI bridge -
The HD audio is on the other side of the PCI-to-PCIe bridge -
Playing music that my ear knows well does not distract me, while the ear will pick up
any changes in machine behavior without any special attention.

= = = =

Now back to writing my commercial-grade kernel/printk.c

Edit
The -09144lk on Cloudbook (C7-M/CX700) is now up 4 1/2 hours *with* USB working.
A new, double, record.

Edit 1:
A correction to the above - the ehci-hcd driver *is* into its re-try loop with message flood
mode - it just hasn't taken out the mouse, the kernel, or killed itself (yet).

Last edited by mikez (2009-05-24 1:27:49 pm)

01/01/10 >> End of an era, no more Jabber at cb-chat.com

Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

That's all folks - a new record of 4 hours and 42 minutes.
- - - -
The difference - a 4C degree difference in ambient temperature.
- - - -
How?
For silicon devices, the mobility (at the molecular/ion level) doubles
for each 10C degree increase in temperature.
A 4C degree lower ambient temperature means *about* a 4C degree
decrease in chip temperature.
This is also the background to why a 10C degree increase in your silicon
device halves its life span.  wink
- - - -
Why?
The ehci-hcd driver gets kicked into its re-try loop by a (false) over-current
detection in the integrated hub.
Without data on the chip set internals, I can only guess the over-current sensor
is sensitive to both temperature and current.
Since they normally are. wink  Also, they are normally thermal compensated to
eliminate the temperature effect.

Does this mean I am running the CX700 over-temperature?
Or
The thermal compensation is not correct on the CX700?
Or
The USB driver is mis-detecting a normal (but extreme) reading?

Can't say - not until I get the SMBus driver working so we can read
the CX700 temperatures and voltage.

Meanwhile - I am intentionally *not fixing* the USB driver - I will need
it to test the new nprintk.c under failure conditions.  wink


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Keep up the hard work MikeZ there are lurkers very interested in your kernel! cool

Slade.

Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Today's effort is posted at the usual place as:

2.6.30-rc7-ce1200v-09145-db.md5
2.6.30-rc7-ce1200v-09145lk-db.md5
config-2.6.30-rc7-ce1200v-09145-db
config-2.6.30-rc7-ce1200v-09145lk-db
linux-2.6.30-rc7-ce1200v-09145-db_2.6.30-rc7-ce1200v-09145-db-27_i386.deb
linux-2.6.30-rc7-ce1200v-09145lk-db_2.6.30-rc7-ce1200v-09145lk-db-28_i386.deb
linux-firmware-image_2.6.30-rc7-ce1200v-09145-db-27_all.deb
linux-firmware-image_2.6.30-rc7-ce1200v-09145lk-db-28_all.deb

This build is the same code base as yesterday's (-09144{,lk}) pair with the addition of
the kernel's lock dependency checking enabled.
Do not be alarmed at slow boot times - that is the lockdep code running self-tests.  Wait.
You can see the results of the testing in dmesg.

Also, keep in mind I am no longer building in the padlock drivers (via-rng is built-in).
You probably need to load them manually (depends on your distro auto-loading setup):

root@cb01:~# modprobe padlock-aes
root@cb01:~# modprobe padlock-sha
root@cb01:~# openssl engine
(padlock) VIA PadLock (no-RNG, ACE)
(dynamic) Dynamic engine loading support

= = = =

I have heard from the author of the SMBus driver - it **is expected to work on CX700**
Will have to give it another try RSN.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

MikeZ-

Better take a look at your webserver hosting the kernels.. it's running pretty sluggish.

EDIT: thanks for SIGHUPping the webserver.

Last edited by aastaneh (2009-05-25 12:10:25 pm)

Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Not me!  I was sound asleep the entire time.
The last few builds were announced on LKML in answer to "not providing enough information".
It could be people wget'ting the whole directory, or it could be just a Dreamhost slowdown.
Will take a look at the logs later tonight.  Just glad that the problem fixed itself.
= = = =
There is a driver for the hardware watchdog, it is still "in the shop" at VIA.  Maybe soon.
Same with more technical manuals - they are still on the "release path" at VIA.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Stumbled on a "wrong-code" generation error in GCC-4.3 - - GCC-4.1.2 gets it correct.

Although done on only a hunch - switching to 4.1.2 for these kernels a few weeks back
was the "right thing to do" - - wink


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Rhetorical question:  How deep will this sh.. become?

Edit:
I should stop my complaining, the output of:
gcc -O2 -S -fomit-frame-pointer ...
Is hard to beat (if the compiler version gets it right, and so far, 4.1.2 does);
even by someone who has been hand coding Intel code since the I4004.
<< I had to brag, just to get myself out of yesterday's funk.>>

Of course, GCC only generates enough of the instruction set to translate source programs.
So hand coding is not completely out of style (although a dying art).

Last edited by mikez (2009-05-26 5:47:42 am)


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

mikez wrote:

Stumbled on a "wrong-code" generation error in GCC-4.3 - - GCC-4.1.2 gets it correct.

Although done on only a hunch - switching to 4.1.2 for these kernels a few weeks back
was the "right thing to do" - - wink

Incorrect
It is the magic decoder in objdump that lists the instruction for 0x8d, 0x04, 0x11 incorrectly.
gcc codes it right, objdump -d displays it wrong.  Groan.
Incorrect 2:
It isn't *wrong*, just an alternative listing not expected by this reader - - I misread it.
Still, I would rather inspect what is being generated to check myself on what I write.
Even on the chance that I will mis-read what I am checking.  big_smile
My goal is a bullet-proof message writer - all else can turn to c..., but I want to see the message.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Still here, still on the job.  Although my time is spent on the re-designed printk.c - -
The general problem has been stirring around in the back of my mind -
= = = =
There is not a lot of information to be gained from a glowing power-on light and a locked machine, but. . .
*) The up-time difference between a kernel that uses the 'lock' opcode in critical areas and one that
does not is about an order of magnitude; _consistent_
*) Adding the lockdep reporting (and structure changes required) halves the up-time; _consistent_
*) The difference between the system chip sets CX700 and CN896 is significant (no failures on CN896).
= = = =
This sounds very much like a cache coherency or structure alignment on a cache line problem.
The cache is on the processor chip - same in both machines -
The devices that modify memory are on the system chip - different implementations (CX700/CN896) -
Also, half of what makes up the cache coherency controls is on the system chip (the *I/O* making
a change to a memory address has to "know" if the cache line and the memory line are consistent).
= = = =
The above line of thought does not rule out software errors, such as "out of bounds" write errors
(demonstrated by changing the ring-buffer size) or the human error of not using the correct instruction
sequence somewhere (demonstrated by globally changing the LOCK_PREFIX byte).
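
For readers who have not met the 'lock' prefix: a minimal sketch of a 'lock'-prefixed increment in GCC inline assembly is below. DEMO_LOCK_PREFIX and demo_atomic_inc are made-up names for illustration; this is not the kernel's LOCK_PREFIX macro and not the change used in the -lk builds. The non-x86 branch is only there so the sketch compiles elsewhere.

```c
#if defined(__i386__) || defined(__x86_64__)

/* Hypothetical stand-in for a macro that always emits the prefix. */
#define DEMO_LOCK_PREFIX "lock; "

/*
 * Atomically increment *v. The 'lock' prefix makes the read-modify-write
 * of the memory operand atomic with respect to other bus agents.
 */
static inline void demo_atomic_inc(int *v)
{
    __asm__ __volatile__(DEMO_LOCK_PREFIX "incl %0"
                         : "+m" (*v)        /* memory operand, read and written */
                         :
                         : "memory", "cc"); /* memory and flags are changed */
}

#else

/* Fallback for other architectures: use the compiler's atomic builtin. */
static inline void demo_atomic_inc(int *v)
{
    __atomic_fetch_add(v, 1, __ATOMIC_SEQ_CST);
}

#endif
```

With the prefix string set to "" the same instruction is still emitted, just without the bus-level guarantee; that is the kind of difference being compared between the plain and -lk builds.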
= = = =
I would be very surprised if the Linux kernel *was not* reporting the problem - - -
It is just a matter of making a bullet-proof message writer so we can read the notice.  wink


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

While browsing last night's LKML postings - in an un-related patch:

        /*
         * Assume PCI cacheline size of 32 bytes for all x86s except K7/K8
         * and P4. It's also good for 386/486s (which actually have 16)
         * as quite a few PCI devices do not support smaller values.
         */
+
        pci_cache_line_size = 32 >> 2;
        if (c->x86 >= 6 && c->x86_vendor == X86_VENDOR_AMD)
                pci_cache_line_size = 64 >> 2;  /* K7 & K8 */
        else if (c->x86 > 6 && c->x86_vendor == X86_VENDOR_INTEL)
                pci_cache_line_size = 128 >> 2; /* P4 */
+}

Where is the (c->x86 >= 6 && c->x86_vendor == X86_VENDOR_VIA) ?
What is the pci_cache_line_size in the CX700?  In the CN896?
Duh....


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Unknown.
I hard coded the pci_cache_line_size to the processor's cache line size - -
Will see what difference that makes - -
Other than making the machine boot much faster and the music louder. wink
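
To picture what an extra vendor case in that selection logic might look like, here is a stand-alone sketch, not the patch itself. The enum, the function name and the via_line_bytes parameter are all hypothetical; the correct value for the CX700 or CN896 is exactly the open question, so the caller has to supply it. Sizes are returned in 32-bit words (bytes >> 2), as in the quoted code.

```c
/* Illustrative vendor tags; the kernel uses its own X86_VENDOR_* constants. */
enum demo_vendor { DEMO_VENDOR_INTEL, DEMO_VENDOR_AMD, DEMO_VENDOR_VIA };

/*
 * Return the PCI cache line size in 32-bit words, following the logic of
 * the quoted fragment plus one extra, hypothetical VIA branch.
 * via_line_bytes is a caller-supplied guess; pass 0 to keep the default.
 */
unsigned int demo_pci_cache_line_size(enum demo_vendor vendor, int family,
                                      unsigned int via_line_bytes)
{
    unsigned int size = 32 >> 2;            /* default: 32 bytes */

    if (family >= 6 && vendor == DEMO_VENDOR_AMD)
        size = 64 >> 2;                     /* K7 & K8 */
    else if (family > 6 && vendor == DEMO_VENDOR_INTEL)
        size = 128 >> 2;                    /* P4 */
    else if (family >= 6 && vendor == DEMO_VENDOR_VIA && via_line_bytes != 0)
        size = via_line_bytes >> 2;         /* assumption: value unconfirmed */

    return size;
}
```

For example, demo_pci_cache_line_size(DEMO_VENDOR_VIA, 6, 64) returns 16, while passing 0 for the last argument falls back to the 32-byte default (8 words).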
= = = =
It also looks like our "too slow" disk drive needs to be put on the exception
list for udma/66 so the driver will stop running it at udma/33.  wink


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Wise choice, wise choice. We can't manipulate UDMA parameters using hdparm, so that is the best way to kick up performance a notch, provided the chipset and the drive support it. As for the cache line size, who knows.

Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Question is still pending to VIA-CPU engineering. . .
Have heard back from VIA-CPU engineering on the LOCK_PREFIX usage, quote: "--- use it ---".
= = = =
They (VIA) report doing some experiments with disabling 'lock' in the cpu - -
doing so breaks a popular, proprietary, operating system.
= = = =
Today's 09147lk-db has just passed its previous record of 2 hours (the -09145lk-db) - - -
Will see how this goes....
= = = =
Got this post typed without it dying, it usually dies when I brag on it. tongue


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Today's debug build is now 50% past its own prior record;
Had the USB-2.0 driver fail again;
Got a lock-dep report on lock conflicts out of it (posted to LKML);
The music plays on - without the external mouse of course.  wink

= = = =

Now will rebuild this code base without the lockdep, put back the VIA stuff (e_powersaver, i2c-viapro) -
see if the use of 'lock' and the changed cache line size is the "magic bullet" for 2.6.30.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Build with the 'lock' and the changed cache line size against today's repository head is looking good - -
This is a _variable speed_ kernel with e_powersaver - i2c-viapro also loads - not sure it is working.
= = = =
Will probably build a -09148lkcs tomorrow for posting.  Machine is still running, music playing.
= = = =

Historical note:
A number of years ago, in the HP PARISC Linux port - - some experimental kernels were built
that forced the linker to place all of the *_lock_* thingies into their own, cache line aligned, section,
where their access could not indirectly affect other cached data - -
I don't know if that ever made it into the x86 kernel (or even into the parisc kernel) - -
Will check on that for tomorrow's build.
- - - -
Besides, we might get VIA Tech's recommendation on this RSN.
- - - -
I, at the user end of the spectrum, and VIA Tech's engineering department are
of a like mind on the subject of using the 'lock' prefix - members of LKML have a
scattered collection of opinions.
As long as we get an order of magnitude improvement in up-time - - it stays in!


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Hmm...
The USB-2.0 driver did its thing - I did my thing to make it go away - losing the external mouse - - -
The music played on - - -
Half an hour later - the music started to stutter - so I stopped VLC and tried watching Hulu - - -
That cost me the touchpad and keyboard after a few minutes -
But
I was still able to restart the machine from the ssh remote terminal session -
= = = =
Translation:  Kernel (almost) fixed - now need to deal with driver problems.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Great! You're making progress!

Try running this against the new kernel with the udma patch and the one without.. I'm curious about the difference:

hdparm -Tt /dev/sda

Last edited by aastaneh (2009-05-27 8:29:42 pm)

Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

I haven't done the disk patch yet - I need to look that one up.

There was a recent post on LKML about the locking structures
used by pulse-audio - - they were "tagged" as for KVM - but
perhaps they are general - will have to check on that also.

It has been a long day - more tomorrow.  wink


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Sound (no pun intended - pulse audio uses this) familiar?
http://bugzilla.kernel.org/show_bug.cgi?id=13331
Read comment #5
= = = =
Will check if it is in today's code base, otherwise I will patch it in.

Edit:
Neither the original nor the changed lines of code in the bug
report are in 2.6.30 - - somebody "optimized it" - - I "Un-optimized" it. roll

Edit 2:
Found some more suspicious things in the cache set-up - -
One was a feature I could option out -
The other I sent a query to VIA Tech about.

If this build has a reasonable up-time, I will post it later today.

Edit 3:
I am going to use this on the HP-2133 (C7-M/CN896) and see if I broke that one.  wink
And...
The HP doesn't use the same NIC/driver - which may be the difference.

Edit 4:
Made it worse.  Phooey.  Well, at least I know what I will be doing this afternoon.

Edit 5:
The HP-2133 just passed the 6 hour up-time mark. Works fine, if you don't need Wifi.  wink

Last edited by mikez (2009-05-28 1:01:12 pm)


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Close, but no prize yet - -
The 2.6.30 version of the patch seems to be correct -
**But**  I did find some asm macros with missing "memory" notation in the clobber list. 
Will be testing that next (yeah, you gotta tell gcc when you change memory inside asm).
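
For anyone wondering what the "memory" entry in a clobber list looks like, a minimal sketch follows. demo_xchg is a made-up name, not one of the kernel macros referred to above. The clobber tells gcc that the asm may touch memory beyond its listed operands, so gcc must not keep stale copies of memory values in registers across it.

```c
/*
 * Swap newval into *ptr and return the previous contents.
 * demo_xchg is a hypothetical helper used only for illustration.
 */
static inline int demo_xchg(int *ptr, int newval)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("xchgl %0, %1"
                         : "+r" (newval),   /* in: new value, out: old value */
                           "+m" (*ptr)      /* the memory word being swapped */
                         :
                         : "memory");       /* gcc: memory changed in here */
    return newval;
#else
    /* Fallback so the sketch builds on non-x86 machines. */
    return __atomic_exchange_n(ptr, newval, __ATOMIC_SEQ_CST);
#endif
}
```

Leaving "memory" out of the clobber list still compiles; the risk is that the optimizer reorders or caches loads and stores around the asm, which is the kind of thing that is hard to spot by reading the source alone.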


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

The HP-2133 has been up nearly 12 hours now - no sign of problems.
The Cloudbook up-time is reaching a significant length now (for rc7-git3).
= = = =
This futex service the kernel provides user-space also has a potentially
endless loop (our endless loop?) that can hang if cache/memory/IO
coherency fails.
I plan to add an error exit to that routine tomorrow - we should have
another test build tomorrow. 
Note: I am not going to start at 2:30am tomorrow.  wink
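
The general idea of an "error exit" for a retry loop can be sketched as below. This is not the kernel's futex code and not the planned patch; demo_bounded_add and DEMO_MAX_RETRIES are made-up names and the retry limit is an arbitrary assumption. The point is only that a compare-and-swap loop which normally spins until it succeeds gets a bounded number of attempts and then reports failure instead of hanging.

```c
#include <errno.h>

/* Arbitrary, hypothetical bound on the number of attempts. */
#define DEMO_MAX_RETRIES 1000

/*
 * Atomically add delta to *addr using a compare-and-swap loop.
 * Returns 0 on success, or -EAGAIN if the swap never succeeds within
 * DEMO_MAX_RETRIES attempts (for example, if the value read and the
 * value in memory never agree).
 */
int demo_bounded_add(int *addr, int delta)
{
    int tries;

    for (tries = 0; tries < DEMO_MAX_RETRIES; tries++) {
        int old = __atomic_load_n(addr, __ATOMIC_SEQ_CST);

        /* Succeeds only if *addr still holds the value we just read. */
        if (__atomic_compare_exchange_n(addr, &old, old + delta, 0,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
            return 0;
    }

    return -EAGAIN;   /* give up and let the caller report it */
}
```

On a healthy machine the first attempt normally succeeds; the error path only matters when something underneath is misbehaving, which is the case of interest here.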
= = = =
The project that maintains the USB drivers has a bunch of changes
ready, will check those while listening to music tomorrow - Cloudbook music.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Today's effort (-09149, yyddd) is up at http://hp-umpc.com/ce1200v
NOTE: You have to type or cut&paste that link - it doesn't "click" - that is intentional.

As:

2.6.30-rc7-ce1200v-09149.md5
config-2.6.30-rc7-ce1200v-09149
linux-2.6.30-rc7-ce1200v-09149_2.6.30-rc7-ce1200v-09149-35_i386.deb
linux-firmware-image_2.6.30-rc7-ce1200v-09149-35_all.deb
patch-2.6.30-rc7-ce1200v-09149

If making your own build, you want the config and patch file above plus 2.6.30-rc7-git3 from kernel.org

The patch modifies:

#       modified:   arch/x86/include/asm/alternative.h
#       modified:   arch/x86/include/asm/futex.h
#       modified:   arch/x86/kernel/cpu/centaur.c
#       modified:   arch/x86/pci/common.c
#       modified:   kernel/futex.c

Evidently, the C7M/CX700 is much more sensitive to cache/memory/I-O coherency issues
than other system chip sets used with the C7M.
I don't consider the C7M/CX700 whack-a-bug project finished, but this one is forward progress.

Although not intended for the C7M/CN896 - the above binary will run on the HP-2133 -
perhaps forever - at least longer than 12 hours - music playing.  wink
The firmware image does not include the Broadcom firmware (yet) - -
You may have to kill NetworkManager and NetworkManagerD to keep the SSB driver from
puking all over the kernel message buffer about the missing firmware.

Note: If using the hangcheck timer, unload it before suspending, it doesn't know how
to sleep and wake up - it will reboot the machine as soon as it resumes.  wink

The SMBus driver will load:

modprobe i2c-dev
modprobe i2c-viapro

But sensors-detect from lm-sensors can't find the temperature/voltage sensors on the bus.  [????]

The Padlock drivers load and work:

modprobe padlock-aes
modprobe padlock-sha

The random number generator is built-in, but nothing seems to use it (yet).

Note: This is a **variable speed** kernel and is beating the up-time records set by the
previous fixed-speed kernel builds.  Forward progress.  wink

Enjoy, comments here welcomed.


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Hmm....
The bad:  This thing still has an "up-time limiter" in it, somewhere - - -
The good: I can still test any sanity checker I come up with.  roll
The interesting: 93 *.deb downloads served in the past week, somebody is watching this.  tongue

Last edited by mikez (2009-05-29 8:14:06 am)


Re: Kernel 2.6.30-rc7 (generic *.deb packaging)

Just tested it... unfortunately I still lose keyboard/mouse in less than 10m uptime.
Dmesg dumped nothing except six of these:

ACPI: Device [FAN0] failed to transition to D3

It's good to see that some progress was made, though.. big_smile

EDIT: When trying to shut down the machine, I get this message instead:

INFO: task phy0:337 blocked for more than 120 seconds.
Kernel panic - not syncing: hung_task: blocked tasks

Last edited by aastaneh (2009-05-29 8:26:55 am)