Intel® Edge Software Hub
Problems with "Isecl control plane IP not set" even after the workaround from troubleshooting

javierfh
Novice

Hello,

I am trying to deploy the Intel Smart Edge Open Developer Experience Kit in a Multi-node Cluster without ESP
(in my case, one master and one worker for testing).
I am deploying it in VMs (on OpenStack) and following these instructions:

https://edc.intel.com/content/www/us/en/design/technologies-and-topics/iot/deploy-smart-edge-open-developer-experience-kit-in-a-multi-node-cluster-w/overview/

The instructions are fine apart from minor hiccups, but when I get to the deploy part it starts deploying and eventually fails.

I think this is the relevant part:

TASK [validate Intel-secl] **********************************************************************************************************************************************************
task path: /home/ubuntu/dek/tasks/settings_check_ne.yml:62
Thursday 04 May 2023  10:48:15 +0000 (0:00:00.020)       0:00:02.065 **********
included: /home/ubuntu/dek/tasks/../roles/security/isecl/common/tasks/precheck.yml for node01, controller

TASK [Check control plane IP] *******************************************************************************************************************************************************
task path: /home/ubuntu/dek/roles/security/isecl/common/tasks/precheck.yml:7
Thursday 04 May 2023  10:48:15 +0000 (0:00:00.030)       0:00:02.096 **********
fatal: [node01]: FAILED! => {
    "changed": false
}
MSG:
Isecl control plane IP not set!
fatal: [controller]: FAILED! => {
    "changed": false
}
MSG:
Isecl control plane IP not set!

NO MORE HOSTS LEFT ******************************************************************************************************************************************************************
PLAY RECAP **************************************************************************************************************************************************************************
controller                 : ok=12   changed=0    unreachable=0    failed=1    skipped=15   rescued=0    ignored=0
node01                     : ok=12   changed=0    unreachable=0    failed=1    skipped=15   rescued=0    ignored=0
Thursday 04 May 2023  10:48:15 +0000 (0:00:00.025)       0:00:02.121 **********
===============================================================================
gather_facts ------------------------------------------------------------ 1.15s
shell ------------------------------------------------------------------- 0.36s
fail -------------------------------------------------------------------- 0.25s
include_tasks ----------------------------------------------------------- 0.14s
include_vars ------------------------------------------------------------ 0.05s
set_fact ---------------------------------------------------------------- 0.04s
debug ------------------------------------------------------------------- 0.03s
command ----------------------------------------------------------------- 0.03s
assert ------------------------------------------------------------------ 0.02s
stat -------------------------------------------------------------------- 0.02s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
total ------------------------------------------------------------------- 2.08s

2023-05-04 10:48:18.622 INFO: ====================
2023-05-04 10:48:18.623 INFO: DEPLOYMENT RECAP:
2023-05-04 10:48:18.623 INFO: ====================
2023-05-04 10:48:18.623 INFO: DEPLOYMENT COUNT: 1
2023-05-04 10:48:18.623 INFO: SUCCESSFUL DEPLOYMENTS: 0
2023-05-04 10:48:18.623 INFO: FAILED DEPLOYMENTS: 1
2023-05-04 10:48:18.623 INFO: DEPLOYMENT "demo_mep": FAILED
2023-05-04 10:48:18.623 INFO: ====================
2023-05-04 10:48:18.623 INFO: Deployment failed, pulling logs
2023_05_04_10_48_18_SmartEdge_experience_kit_archive.tar.gz                                                                                        100%  795KB  13.1MB/s   00:00
log_collector                                                                                                                                      100%   15KB   2.1MB/s   00:00
log_collector.json                                                                                                                                 100% 3928    11.3MB/s   00:00
ubuntu@controlplane:~/dek$ 


It seems that it failed with the ISECL-CONTROL-PLANE-IP-NOT-SET error described at this link:
https://www.intel.com/content/www/us/en/docs/ei-for-amr/get-started-guide-server-kit/2022-3/troubleshooting.html#ISECL-CONTROL-PLANE-IP-NOT-SET-ERROR

I already had “platform_attestation_node: false” set; that was already fixed in the instructions in the tutorial.
So do you have any other advice? It seems to be the same problem, but probably caused by something else. I am happy to send the relevant config files if needed.
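For reference, this is how I double-check which value the deployment actually sees (the path is the one from the Developer Experience Kit layout; just a sanity check on my side):

grep -n "platform_attestation_node" ~/dek/inventory/default/group_vars/all/10-default.yml
# in my file it currently shows: platform_attestation_node: false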

Javier

1 Solution (see the accepted reply from JesusE_Intel below)
10 Replies
JesusE_Intel
Moderator

Hi Javier,


That was going to be my suggestion. Last time I ran into that problem, I had to modify the 10-default.yml config file located at ~/dek/inventory/default/group_vars/all/10-default.yml with the changes described in the link you shared: https://www.intel.com/content/www/us/en/docs/ei-for-amr/get-started-guide-server-kit/2022-3/troubleshooting.html#ISECL-CONTROL-PLANE-IP-NOT-SET-ERROR
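If I remember correctly, the change boils down to disabling platform attestation in that file; something like the fragment below (the variable name is taken from your earlier message and the troubleshooting page, so please double-check it against the guide):

# ~/dek/inventory/default/group_vars/all/10-default.yml
platform_attestation_node: false   # skip Intel SecL attestation so the control plane IP check is not triggered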


I'm not sure if the issue is related to running this software package in a virtual machine. I'll have to set one up and try it from my end. In the meantime, if you have another physical machine, try installing the package there and see if you run into the same issue.


Regards,

Jesus


javierfh
Novice

Hello Jesus,

 

Thanks a lot for your interest.


The instructions say that this should work with VMs too. In fact, I managed to deploy the Intel Smart Edge single node on a VM, even though I wasn't able to see my onboarded services in Harbor, but that may be another issue. At the moment, unfortunately, I don't have spare machines to try with.

Let's see if we can make this work; it looks incredibly interesting.

At the same time, I need to check an idea I had. I was looking at the security groups in OpenStack and thought that maybe I need to explicitly enable TCP and UDP traffic between VMs in the tenant (it shouldn't be needed, but just in case the network configuration is filtering some traffic).
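If the security groups do turn out to be the problem, I plan to open them up with something along these lines (assuming both VMs are in the default security group; the exact flags may differ depending on the OpenStack client version):

# allow all TCP and UDP between instances that share the "default" security group
openstack security group rule create --protocol tcp --remote-group default default
openstack security group rule create --protocol udp --remote-group default default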

I'll keep you posted if I manage to advance, and I would appreciate it if you could also keep me posted if you discover something.

Thanks a lot, 

Javier

javierfh
Novice

Hi, 

 

I have been trying to see if I can make this work with the current setup, but we are having difficulties.
Things that I have tried:

  • I was wondering if the configurations were not being modified properly, and noticed that there is a folder automatically created when attempting the deployment, named after the deployment (in my case demo_mep):

 

ubuntu@controlplane:~/dek/inventory/automated$ ls 
demo_mep

 

And it seems that it copies the group_vars and the host_vars there again:

 

ubuntu@controlplane:~/dek/inventory/automated/demo_mep$ ls
group_vars  host_vars  inventory_demo_mep.yml

 

So I tried to remove the created folder after making changes in the config files and before re-deploying, in case the files were not being updated properly, but it did not help. Still the same problem (the exact command I used is in the snippet after this list).

  • I was also reviewing /dek/inventory/automated/demo_mep/group_vars/all/10-default.yml and there, out of desperation and to simplify things, I disabled

 

# Disable sriov kernel flags (intel_iommu=on iommu=pt)
# iommu_enabled: true
iommu_enabled: false

hoping to disable SR-IOV, in case that is what is messing things up here.
Or should I explicitly disable SR-IOV here:

## SR-IOV Network Operator
sriov_network_operator_enable: true
## SR-IOV Network Operator configuration
sriov_network_operator_configure_enable: true
## SR-IOV Network Nic's Detection Application
sriov_network_detection_application_enable: false

 

As you can see, I was guessing here: since it is a connectivity problem, I wondered whether SR-IOV was the culprit and tried to simplify things, but I may be totally wrong.

  • I looked into the OpenStack connectivity and explicitly enabled TCP and UDP traffic between the instances in the same tenant (even though it shouldn't be necessary). It didn't help either.

  • The next thing I will try is to create two individual VMs with VirtualBox or VMware, in case OpenStack is what is preventing the deployment.
  • And last but not least, do you have some public Slack or IRC where developers meet and help each other?
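Regarding the first point, this is what I mean by removing the generated folder before re-deploying (using my deployment name, demo_mep):

# remove the auto-generated inventory so the next run re-copies my edited group_vars/host_vars
rm -rf ~/dek/inventory/automated/demo_mep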

Any help would be much appreciated,

Javier

JesusE_Intel
Moderator

Hi Javier,


I appreciate the detailed explanation of what you have tried. I've been trying to configure a local environment to test this; unfortunately, I have not been successful. Let me try to set up another system. I would also recommend checking that you are able to ping both VMs using their hostnames. This should be configured on the DHCP server side and not on each client/VM.
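Something as simple as this from each VM should work (the hostnames below are just placeholders; replace them with your own):

# from the control-plane VM
ping -c 3 <worker-hostname>
# from the worker VM
ping -c 3 <control-plane-hostname>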


I will provide another update shortly, appreciate your patience.


Regards,

Jesus


javierfh
Novice

Hi Jesus,

 

On the contrary, thanks to you for your efforts.
I also wanted to update you on my work.

We decided to skip the deployment in VMs provided by OpenStack, just in case OpenStack Neutron, which takes care of the networking, is doing something that prevents the scripts from running properly.

What we decided to do instead was to create two different VMs with VirtualBox, and it seems that we are very close. I still think we have some problems, maybe related to the hostname and DNS.
I have seen that at some point it shows the URL and even the user and password for Harbor. Using the hostname stored in $host won't work, but if I use the IP address instead, I can log in to Harbor and see that it has deployed things there.
After that, I added the IP and hostname of the control-plane machine to /etc/hosts, so that the script would find the right address, but there still seems to be some trouble.
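For reference, this is the kind of entry I added and the quick check I run against the Harbor API (the IP below is just an example from my setup; the hostname is the one that appears in the Harbor URL from my log):

# added to /etc/hosts on the deployment machine
192.168.56.10   isecontrolplane

# quick check that the Harbor API answers on its NodePort
curl -k "https://isecontrolplane:30003/api/v2.0/projects?project_name=intel"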

Attached here is the last part of the log file.

I hope this clarifies the current situation a little more.
Please let me know if you need any more info to help you understand it.

Javier

 

TASK [kubernetes/harbor_registry/controlplane : Check if intel project exists] ****************************************************************************
task path: /home/javier/dek/roles/kubernetes/harbor_registry/controlplane/tasks/main.yml:24
Tuesday 16 May 2023  16:56:13 +0000 (0:00:00.026)       0:01:23.166 *********** 
FAILED - RETRYING: Check if intel project exists (10 retries left).
FAILED - RETRYING: Check if intel project exists (9 retries left).
FAILED - RETRYING: Check if intel project exists (8 retries left).
FAILED - RETRYING: Check if intel project exists (7 retries left).
FAILED - RETRYING: Check if intel project exists (6 retries left).
FAILED - RETRYING: Check if intel project exists (5 retries left).
FAILED - RETRYING: Check if intel project exists (4 retries left).
FAILED - RETRYING: Check if intel project exists (3 retries left).
FAILED - RETRYING: Check if intel project exists (2 retries left).
FAILED - RETRYING: Check if intel project exists (1 retries left).
fatal: [controller]: FAILED! => {
    "attempts": 10,
    "changed": false,
    "connection": "close",
    "content": "",
    "content_length": "150",
    "content_type": "text/html",
    "date": "Tue, 16 May 2023 16:58:08 GMT",
    "elapsed": 1,
    "redirected": false,
    "server": "nginx",
    "status": 502,
    "url": "https://isecontrolplane:30003/api/v2.0/projects?project_name=intel"
}

MSG:

Status code was 502 and not [200, 404]: HTTP Error 502: Bad Gateway

NO MORE HOSTS LEFT ****************************************************************************************************************************************

PLAY RECAP ************************************************************************************************************************************************
controller                 : ok=173  changed=18   unreachable=0    failed=1    skipped=168  rescued=0    ignored=0   
node01                     : ok=130  changed=10   unreachable=0    failed=0    skipped=130  rescued=0    ignored=0   

Tuesday 16 May 2023  16:58:08 +0000 (0:01:54.596)       0:03:17.762 *********** 
=============================================================================== 
kubernetes/harbor_registry/controlplane ------------------------------- 115.00s
infrastructure/install_dependencies ------------------------------------ 18.76s
infrastructure/docker -------------------------------------------------- 18.35s
baseline_ansible/infrastructure/install_packages ------------------------ 9.01s
gather_facts ------------------------------------------------------------ 5.56s
kubernetes/install ------------------------------------------------------ 4.81s
kubernetes/cni/calico/controlplane -------------------------------------- 4.08s
infrastructure/os_setup ------------------------------------------------- 3.76s
baseline_ansible/infrastructure/os_requirements/disable_swap ------------ 2.19s
infrastructure/firewall_open_ports -------------------------------------- 1.93s
baseline_ansible/infrastructure/os_requirements/dns_stub_listener ------- 1.34s
infrastructure/conditional_reboot --------------------------------------- 1.14s
baseline_ansible/infrastructure/os_proxy -------------------------------- 1.01s
infrastructure/install_hwe_kernel --------------------------------------- 0.96s
kubernetes/cni ---------------------------------------------------------- 0.81s
baseline_ansible/infrastructure/configure_udev -------------------------- 0.74s
baseline_ansible/infrastructure/time_verify_ntp ------------------------- 0.72s
baseline_ansible/infrastructure/install_openssl ------------------------- 0.71s
kubernetes/helm --------------------------------------------------------- 0.71s
fail -------------------------------------------------------------------- 0.49s
kubernetes/customize_kubelet -------------------------------------------- 0.47s
baseline_ansible/infrastructure/install_userspace_drivers --------------- 0.46s
kubernetes/controlplane ------------------------------------------------- 0.45s
shell ------------------------------------------------------------------- 0.42s
kubernetes/cert_manager ------------------------------------------------- 0.40s
infrastructure/build_noproxy -------------------------------------------- 0.35s
baseline_ansible/infrastructure/install_golang -------------------------- 0.33s
infrastructure/check_redeployment --------------------------------------- 0.30s
include_tasks ----------------------------------------------------------- 0.29s
baseline_ansible/infrastructure/os_requirements/enable_ipv4_forwarding --- 0.26s
infrastructure/git_repo_tool -------------------------------------------- 0.23s
debug ------------------------------------------------------------------- 0.19s
infrastructure/e810_driver_update --------------------------------------- 0.18s
baseline_ansible/infrastructure/time_setup_ntp -------------------------- 0.17s
baseline_ansible/infrastructure/configure_cpu_isolation ----------------- 0.17s
baseline_ansible/infrastructure/selinux --------------------------------- 0.14s
infrastructure/provision_sgx_enabled_platform --------------------------- 0.11s
baseline_ansible/infrastructure/configure_cpu_idle_driver --------------- 0.10s
baseline_ansible/infrastructure/configure_additional_grub_parameters ---- 0.10s
baseline_ansible/infrastructure/configure_hugepages --------------------- 0.09s
baseline_ansible/infrastructure/configure_sriov_kernel_flags ------------ 0.09s
baseline_ansible/infrastructure/disable_fingerprint_authentication ------ 0.07s
set_fact ---------------------------------------------------------------- 0.05s
include_vars ------------------------------------------------------------ 0.05s
baseline_ansible/infrastructure/install_rt_package ---------------------- 0.04s
infrastructure/setup_baseline_ansible ----------------------------------- 0.03s
assert ------------------------------------------------------------------ 0.03s
baseline_ansible/infrastructure/install_tuned_rt_profile ---------------- 0.03s
command ----------------------------------------------------------------- 0.02s
stat -------------------------------------------------------------------- 0.02s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
total ----------------------------------------------------------------- 197.72s

 


And attached here is a picture of Harbor as it is right now, with the hostname replaced by the IP:

chrome_eJLgCKlSSo.png

 
JesusE_Intel
Moderator

Hi javierfh,

 

I set up two virtual machines running on the same Windows 10 host system, using VirtualBox to run both VMs with default settings, with the exception of the network type: I used bridged instead of the default NAT. Once both VMs were up and running, I confirmed they can ping each other using their hostnames.

 

I proceeded to install Smart Edge Open using the same guide as you with no modifications, and had a successful deployment.

Overview - 1.0 - ID:764706 | Deploy Intel® Smart Edge Open Developer Experience Kit in a Multi-node Cluster without Edge Software Provisioner (ESP)

 

I was able to access both the Grafana and Harbor web servers. This leads me to believe the issue you are experiencing is related to your local network. I also want to mention that I used Ubuntu 20.04 Server. I will try to set up two new virtual machines with Ubuntu 20.04 Desktop next week.

server1.PNG

 

webservers.PNG

Regards,

Jesus

 

javierfh
Novice

Hi Jesus, 
Let me first thank you for your support and patience.

So I will try to start from scratch (in case something from the old deployment is causing trouble). I will follow your suggestion and configure the network interface in the VMs as bridged instead of NAT.
But please let me ask you a couple of questions:

  1. How many interfaces did you attach to each of the VMs? In my setup, I set two interfaces for each VM: one with NAT used as the control/management interface, and a second one, a host-only adapter, which is the one I used for the configuration. Maybe this is what causes some trouble at some point.
  2. Would you be so kind as to share your /etc/hosts files from both nodes? I configured the IP addresses of both nodes there, because I had trouble setting the env variables when following the tutorial.
  3. I believe in this line 
    # set variable for later use
    export worker="TBD" # replace with actual value

    I have to set the name of the worker node, correct?

  4. And what exactly do you mean by saying you used 20.04 server? Do you mean that you deployed Ubuntu 20.04 Server on your VMs? If so, it is the same thing I am doing on my side, so if you managed with Ubuntu Server, we should be good.

Anyway, thanks for all your help. It is very good to hear you managed to deploy with VMs; I feel optimistic that I can get it deployed with your help.

Javier

JesusE_Intel
Moderator

Hi javierfh,

 

> How many interfaces did you attach to each of the VMs? In my setup, I set two interfaces for each VM: one with NAT used as the control/management interface, and a second one, a host-only adapter, which is the one I used for the configuration. Maybe this is what causes some trouble at some point.

Both VMs have only one interface, configured as bridged.

 

> Would you be so kind as to share your /etc/hosts files from both nodes? I configured the IP addresses of both nodes there, because I had trouble setting the env variables when following the tutorial.

$ cat /etc/hosts #server1
127.0.0.1 localhost server1
127.0.1.1 server1

#The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback server1 server1
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

 

$ cat /etc/hosts #server2
127.0.0.1 localhost server2
127.0.1.1 server2

#The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback server2
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

 

The IP addresses are assigned by the router's DHCP server, and the local DNS takes care of name resolution in my network. I didn't have to make any additional changes to the hosts file to map the machines' IP addresses to their hostnames.

 

> I believe that in this line, export worker="TBD" # replace with actual value, I have to set the name of the worker node, correct?

The following commands were run on server1 (the control-plane node):

export host=$(hostname)
export worker=server2

> And what exactly do you mean by saying you used 20.04 server? Do you mean that you deployed Ubuntu 20.04 Server on your VMs? If so, it is the same thing I am doing on my side, so if you managed with Ubuntu Server, we should be good.

Correct, I was using Ubuntu 20.04 Server on my VMs. I thought you were using the desktop version of Ubuntu 20.04; I will keep it the same as yours.

 

Regards,

Jesus

 

 

javierfh
Novice

Hi Jesus, 

 

I finally have good news. I managed to successfully deploy the multi-node Smart Edge in two VirtualBox VMs.

I did as you said and configured only one interface, as a bridged adapter. VirtualBox selected the default connectivity on my laptop, which was the Wi-Fi, and with that I didn't have connectivity; when I changed it to an Ethernet interface, connectivity returned.

What I did next was to save the env variables (the hostname exports on the master and the worker) in the .profile file, so that they survive reboots. Then I followed your suggestion and configured /etc/hosts like yours, but in my case it wouldn't work. I had to add the IP of the worker node to the /etc/hosts of the master, because otherwise I wouldn't be able to copy the SSH ID to the worker (# to worker ssh-copy-id $USER@$worker).

Once that was clear, I started the deployment, but then got stuck on the ufw configuration.
I rebooted and looked at the logs; it failed with problems that I assumed were related to the worker node not being able to locate the master node. So once I added the IP of the master to the /etc/hosts file on the worker, the deployment succeeded. Hurray!!
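In case it helps anyone else, this is roughly what I ended up with on the two VMs (the hostnames and IPs below are only examples; use your own):

# appended to ~/.profile on the control-plane node so the variables survive reboots
export host=$(hostname)
export worker=<worker-hostname>

# /etc/hosts on the control-plane node: add the worker
192.168.1.102   <worker-hostname>
# /etc/hosts on the worker node: add the control plane
192.168.1.101   <control-plane-hostname>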

 

I wonder why in your case you didn't need to add the IPs to the /etc/hosts file, but oh well.. now it works.

 

So now that the deployment is done, I would like to ask for your help, in case you have some pointers to documentation on how to use the deployed edge. In particular, we would like to know if there is an architecture document and a guide for next steps: how to onboard services, how to configure them, and whether we can control placement of the instantiated services, in other words, whether we can select on which node services or pods will be instantiated. It would also be helpful if there is some documentation on how to use the different deployed tools. We also want to understand the interfaces or APIs, to see if we can integrate it with our MEC orchestrator.

 

Anyway, your help has been great. I don't know if it is possible, but maybe it would be good to change the title of the post to something like "Problems (/solutions/lessons learned) on how to deploy Intel Smart Edge Open in virtual machines", in case it could help someone else.

 

Thanks once more and looking forward to hearing from you,

Javier

 

JesusE_Intel
Moderator

Hi Javier,


I'm glad to hear that it is working for you! I believe my local DHCP and DNS server took care of the name resolution without needing to update the /etc/hosts files.


Please find additional information about Smart Edge Open in the documentation below; there is also information about the API.

Intel® Smart Edge Open | Developer Experience Kit Default Install (intelsmartedge.github.io)


If you have any questions, please start a new discussion as this one will no longer be monitored.


Regards,

Jesus

