Discussion:
[Bug 1771679] Re: Kernel panic on boot (m1.small in cn-north-1)
Kamal Mostafa
2018-05-17 20:51:31 UTC
** Also affects: linux (Ubuntu Artful)
Importance: Undecided
Status: New

** Also affects: linux-aws (Ubuntu Artful)
Importance: Undecided
Status: New

** Changed in: linux (Ubuntu Artful)
Status: New => In Progress

** Changed in: linux (Ubuntu Artful)
Assignee: (unassigned) => Kamal Mostafa (kamalmostafa)

** No longer affects: linux-aws (Ubuntu Artful)
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1771679

Title:
Kernel panic on boot (m1.small in cn-north-1)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1771679/+subscriptions
--
ubuntu-bugs mailing list
ubuntu-***@lists.ubuntu.com
https://lists.ubun
Ubuntu Kernel Bot
2018-05-17 20:00:05 UTC
This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:

apport-collect 1771679

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
Status: New => Incomplete
Kamal Mostafa
2018-05-17 19:42:37 UTC
This is documented and fixed by mainline commit:

commit d1ecfa9d1f402366b1776fbf84e635678a51414f
Author: van der Linden, Frank <***@amazon.com>
Date: Fri May 4 16:11:00 2018 -0400

x86/xen: Reset VCPU0 info pointer after shared_info remap

This patch fixes crashes during boot for HVM guests on older (pre HVM
vector callback) Xen versions. Without this, current kernels will always
fail to boot on those Xen versions.

Sample stack trace:

BUG: unable to handle kernel paging request at ffffffffff200000
IP: __xen_evtchn_do_upcall+0x1e/0x80
PGD 1e0e067 P4D 1e0e067 PUD 1e10067 PMD 235c067 PTE 0
Oops: 0002 [#1] SMP PTI
Modules linked in:
CPU: 0 PID: 512 Comm: kworker/u2:0 Not tainted 4.14.33-52.13.amzn1.x86_64 #1
Hardware name: Xen HVM domU, BIOS 3.4.3.amazon 11/11/2016
task: ffff88002531d700 task.stack: ffffc90000480000
RIP: 0010:__xen_evtchn_do_upcall+0x1e/0x80
RSP: 0000:ffff880025403ef0 EFLAGS: 00010046
RAX: ffffffff813cc760 RBX: ffffffffff200000 RCX: ffffc90000483ef0
RDX: ffff880020540a00 RSI: ffff880023c78000 RDI: 000000000000001c
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880025403f5c R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880025400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff200000 CR3: 0000000001e0a000 CR4: 00000000000006f0
Call Trace:
<IRQ>
do_hvm_evtchn_intr+0xa/0x10
__handle_irq_event_percpu+0x43/0x1a0
handle_irq_event_percpu+0x20/0x50
handle_irq_event+0x39/0x60
handle_fasteoi_irq+0x80/0x140
handle_irq+0xaf/0x120
do_IRQ+0x41/0xd0
common_interrupt+0x7d/0x7d
</IRQ>

During boot, the HYPERVISOR_shared_info page gets remapped to make it work
with KASLR. This means that any pointer derived from it needs to be
adjusted.

The only value that this applies to is the vcpu_info pointer for VCPU 0.
For PV and HVM with the callback vector feature, this gets done via the
smp_ops prepare_boot_cpu callback. Older Xen versions do not support the
HVM callback vector, so there is no Xen-specific smp_ops set up in that
scenario. So, the vcpu_info pointer for VCPU 0 never gets set to the proper
value, and the first reference of it will be bad. Fix this by resetting it
immediately after the remap.

Signed-off-by: Frank van der Linden <***@amazon.com>
Reviewed-by: Eduardo Valentin <***@amazon.com>
Reviewed-by: Alakesh Haloi <***@amazon.com>
Reviewed-by: Vallish Vaidyeshwara <***@amazon.com>
Reviewed-by: Boris Ostrovsky <***@oracle.com>
Cc: Juergen Gross <***@suse.com>
Cc: Boris Ostrovsky <***@oracle.com>
Cc: xen-***@lists.xenproject.org
Signed-off-by: Boris Ostrovsky <***@oracle.com>
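
To make the failure mode described in the commit message concrete, here is a small user-space toy model in Python. It is only a sketch and is not the kernel code: the class names, the `reset_after_remap` flag, `NEW_VADDR`, and the zero offset for the VCPU 0 entry are all assumptions made for illustration. The old address simply reuses the faulting address from the sample trace. The model shows a pointer derived from a page's old virtual address going stale once that page is remapped, unless it is recomputed right after the remap.

```python
PAGE_MASK = 0xFFF

# Illustrative addresses only. OLD_VADDR reuses the faulting address from the
# sample trace; NEW_VADDR is a made-up post-remap address.
OLD_VADDR = 0xFFFFFFFFFF200000
NEW_VADDR = 0xFFFF880000001000

# Assumption for this toy: the VCPU 0 entry sits at the start of the page.
VCPU0_INFO_OFFSET = 0


class ToyAddressSpace:
    """Maps page-aligned virtual addresses to page contents (offset -> value)."""

    def __init__(self):
        self._pages = {}

    def map_page(self, vaddr, page):
        assert vaddr & PAGE_MASK == 0, "mappings must be page-aligned"
        self._pages[vaddr] = page

    def unmap_page(self, vaddr):
        del self._pages[vaddr]

    def load(self, vaddr):
        base = vaddr & ~PAGE_MASK
        if base not in self._pages:
            # Stand-in for the kernel's page fault on an unmapped address.
            raise LookupError(
                f"unable to handle paging request at {vaddr:016x}")
        return self._pages[base][vaddr & PAGE_MASK]


class ToyGuest:
    """Hypothetical guest that caches a pointer derived from shared_info."""

    def __init__(self, reset_after_remap):
        self.mem = ToyAddressSpace()
        self.reset_after_remap = reset_after_remap
        self.shared_info_page = {VCPU0_INFO_OFFSET: "vcpu_info[0]"}
        self.shared_info = OLD_VADDR
        self.mem.map_page(self.shared_info, self.shared_info_page)
        # Derived pointer, computed from the early mapping.
        self.vcpu0_info = self.shared_info + VCPU0_INFO_OFFSET

    def remap_shared_info(self, new_vaddr):
        # Same page contents, new virtual address; the old mapping goes away.
        self.mem.unmap_page(self.shared_info)
        self.mem.map_page(new_vaddr, self.shared_info_page)
        self.shared_info = new_vaddr
        if self.reset_after_remap:
            # The idea of the fix: recompute the derived pointer immediately.
            self.vcpu0_info = self.shared_info + VCPU0_INFO_OFFSET

    def handle_event_upcall(self):
        # The first use of the cached pointer after the remap.
        return self.mem.load(self.vcpu0_info)
```

Constructing `ToyGuest(reset_after_remap=False)`, calling `remap_shared_info(NEW_VADDR)`, and then calling `handle_event_upcall()` raises `LookupError` at the old address, which loosely mirrors the paging-request failure in the trace. With `reset_after_remap=True` the same sequence reads the VCPU 0 entry through the new mapping.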


** Changed in: linux-aws (Ubuntu Bionic)
Status: New => In Progress

** Also affects: linux (Ubuntu)
Importance: Undecided
Status: New

** Changed in: linux (Ubuntu Bionic)
Assignee: (unassigned) => Kamal Mostafa (kamalmostafa)

** Changed in: linux (Ubuntu Bionic)
Status: New => In Progress