What is Embodied AI?

Traditional AI operates in the digital realm, but embodied AI takes things a step further. It integrates AI algorithms with robots or other physical systems equipped with sensors, like cameras, LiDAR, and haptic sensors. These sensors provide real-world data for the AI to process, giving it a “body” to experience the world directly.


Why is it important?

Embodied AI can overcome the limitations of purely digital AI, especially in tasks requiring physical manipulation. Imagine robots that:

Manufacture: perform precise assembly and quality control.

Deliver: navigate warehouses and handle packages autonomously.

Heal: assist surgeons and guide rehabilitation therapy.

Explore: venture into space and perform underwater rescues.


How does it work?

Sensors: Gather data from the environment (vision, touch, etc.).

Perception: Algorithms interpret the data and create a surroundings representation.

Decision-making: AI models determine the best course of action.

Action: Algorithms control the robot’s movements and interactions.

Learning: The AI adapts and improves its performance based on experience.

It’s a continuous cycle of learning and adaptation, making embodied AI incredibly powerful.
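The five steps above can be sketched as a simple control loop. This is purely illustrative pseudostructure of my own (the function and variable names are not from any real robotics API):

```python
def embodied_ai_step(sense, perceive, decide, act, memory):
    """One iteration of the sense-perceive-decide-act-learn cycle."""
    observation = sense()                    # 1. sensors gather raw data
    world_state = perceive(observation)      # 2. perception builds a representation
    action = decide(world_state, memory)     # 3. decision-making picks an action
    outcome = act(action)                    # 4. actuators execute it
    memory.append((world_state, action, outcome))  # 5. experience to learn from
    return action

# Toy instantiation: a "robot" driving toward a wall with a distance sensor.
memory = []
distance = [5.0]  # metres to the wall
sense = lambda: distance[0]
perceive = lambda obs: {"distance": obs}
decide = lambda state, mem: "stop" if state["distance"] < 1.0 else "forward"

def act(action):
    if action == "forward":
        distance[0] -= 1.0  # each step moves 1 m closer
    return distance[0]

for _ in range(6):
    embodied_ai_step(sense, perceive, decide, act, memory)

print(memory[-1][1])  # the robot eventually decides to "stop"
```

A real system replaces each of these callables with hardware drivers and learned models, but the loop structure stays the same.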


The future:

While still in its early stages, embodied AI holds immense potential. We can expect advancements in:

Robot dexterity and adaptability: More versatile and agile robots for diverse tasks.

AI-human collaboration: Seamless interaction and cooperation between humans and robots.

Ethical considerations: Addressing societal and ethical concerns surrounding embodied AI.

Note: this article was edited with the help of Gemini.

New task

Recently, I was assigned a new task: to enable teleoperation for a compact mobile robot. This little bot boasts two motorized wheels (driven by Maxon EPOS4 motor controllers) and a set of caster wheels. In the preceding months, my colleagues had installed ROS 2 (Humble) on the robot’s onboard computer. They had also cloned the ros2_canopen stack and attempted to control the robot using diff_drive_controller, following the “Pilz Manipulator PRBT 6” example in the official ros2_canopen documentation. Despite their efforts, the robot remained stubbornly stationary.

Then the torch was passed to me. I was asked to continue this work, unravel the mystery, and breathe life into our mini mobile marvel. Let’s get those wheels turning!


Embarking on this task, I delved into the ros2_canopen documentation. Candidly, it wasn’t the most user-friendly read. Consequently, I found myself occasionally spelunking through the source code to grasp its concepts, overall structure, and various parameter configurations.

Next, I created a package called single_epos4_canopen — a modest creation aimed at controlling a single EPOS4 motor using ros2_canopen. This lightweight package sidesteps high-level features like ros2_control and the robot description; instead, it zeroes in on the nitty-gritty: debugging CAN communication.

With single_epos4_canopen in action, I managed to make a wheel rotate by calling the service ros2 service call /single_motor/target canopen_interfaces/srv/COTargetDouble "{ target: 10.0 }" (thanks to the default velocity mode). Ahoy, success! This at least confirms that our CAN network is fine.

And when it’s time to halt the motion, ros2 service call /single_motor/halt std_srvs/srv/Trigger does the trick. Similarly, I can call the nmt_state_reset service to restart the NMT state machine of the node, whose heartbeat is configured to be sent every 1000 ms.

CANopen frame analysis

Thanks to the single_epos4_canopen package, I can monitor and analyze the CANopen frames by running

candump can0

(sorry I was not allowed to share the images).

By testing each CANopen node in turn and carefully checking the CANopen frames, I found that the issues came from the object dictionary (OD) definitions. Updating the wrong OD entries via EPOS Studio finally fixed those issues.

Use ros2_control with ros2_canopen

By following this example, I created the package for controlling both motors with the help of diff_drive_controller from ros2_control. After struggling with the configuration and tests for several days, and after working around errors in some init-handling functions, I can finally control the robot with a joystick:
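Under the hood, what diff_drive_controller computes boils down to simple differential-drive kinematics: a commanded body twist (linear velocity v, angular velocity ω) maps to left and right wheel angular velocities. A minimal sketch, with made-up wheel geometry (not our robot’s actual parameters):

```python
def twist_to_wheel_speeds(v, omega, wheel_radius, wheel_separation):
    """Inverse kinematics of a differential drive: map a body twist
    (v in m/s, omega in rad/s) to left/right wheel angular velocities in rad/s."""
    v_left = v - omega * wheel_separation / 2.0   # linear speed of the left wheel
    v_right = v + omega * wheel_separation / 2.0  # linear speed of the right wheel
    return v_left / wheel_radius, v_right / wheel_radius

# Pure rotation: the wheels spin in opposite directions at equal magnitude.
wl, wr = twist_to_wheel_speeds(v=0.0, omega=1.0, wheel_radius=0.05, wheel_separation=0.4)
print(wl, wr)  # left wheel negative, right wheel positive
```

Getting the wheel radius and separation right in the controller’s YAML is exactly what these two lines of math depend on.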

Simulation

Afterward, I spent two more days updating the robot description file, adding some Gazebo control and sensor plugins, and making the simulated bot tele-operable in the Gazebo simulator via teleop_twist_keyboard:

Future work

For some reason, this project has been paused, though I would have liked to continue toward autonomous navigation and fleet control on this robot. There will probably be no more updates about future developments :(

Robotics is a multidisciplinary field that encompasses various aspects of engineering, computer science, and mathematics. Becoming an excellent robotics software engineer requires a combination of technical expertise, creative problem-solving abilities, and a passion for technology. Here are some key skills one should master to excel in this field:

  1. Strong Programming Skills: A solid foundation in programming languages like C++, Python, Rust, and Java is essential for developing the software that controls and interacts with robots. Familiarity with robotics-specific frameworks and libraries such as ROS (Robot Operating System) and OpenCV (Open Source Computer Vision Library) is also highly valuable.

  2. Mathematics and Physics: Robotics heavily relies on mathematical principles and physical concepts. A strong background in algebra, calculus, geometry, physics, and mechanics is crucial for understanding how robots move, interact with their environment, and execute various tasks.

  3. Control Systems: Control systems play a critical role in ensuring that robots operate safely and efficiently. Knowledge of control theory, linear algebra, and feedback mechanisms is essential for designing control algorithms that stabilize robot movements, maintain position, and handle unexpected situations.

  4. Sensors and Actuators: Robotics systems rely on various sensors to gather information about the environment and actuators to control robot movements. Familiarity with different types of sensors (cameras, lidar, ultrasonic, force/torque sensors, etc.) and actuators (motors, servos, pneumatics) is essential for designing and implementing robust robot control systems.

  5. Kinematics and Dynamics: Kinematics deals with the study of robot motion without considering the masses/inertia/forces/torques involved, while dynamics considers the forces and torques that act on the robot (with body masses and inertia) and how they affect its motion. Understanding kinematics and dynamics is crucial for designing robots that move safely, accurately and efficiently.

  6. Computer Vision: Computer vision techniques enable robots to perceive and understand their surroundings through visual inputs. Familiarity with image processing, object recognition, and machine learning algorithms is essential for developing robots with autonomous navigation, object manipulation, and scene understanding capabilities.

  7. Algorithms and AI: Enhancing the intelligence of robots has been a constant goal in the robotics domain. It involves incorporating advanced technologies and methodologies to enable robots to perform complex tasks, adapt to changing environments, and learn from experience. There are various classic algorithms such as Dijkstra, A*, SLAM, RRT, and optimization methods (QP, SQP, etc.). As the AI revolution approaches, deep learning (DL), reinforcement learning (RL), and large language models (LLMs) are increasingly being used in the robotics field to train robots for tasks such as object recognition, navigation, manipulation, and decision making. Mastering both classical and state-of-the-art algorithms is crucial for robotics software engineers.

  8. Human-Robot Interaction (HRI): Robotics is increasingly focused on interacting with humans in a safe and intuitive manner. Understanding human-computer interaction principles, natural language processing, and gesture recognition is crucial for designing robots that can collaborate effectively with humans.

  9. Problem-Solving and Debugging Skills: Robotics engineers face complex challenges in designing, developing, and testing robot systems. Excellent problem-solving skills, coupled with strong debugging abilities, are essential for identifying and resolving software issues, ensuring robot functionality, and improving performance.

  10. Creative Thinking and Innovation: The field of robotics is constantly evolving, and successful engineers need to be creative and innovative to develop new solutions and adapt to emerging technologies. A willingness to think outside the box and explore unconventional approaches is highly valued in this field. With the rapid pace of technological advancements, it’s crucial for a robotics software engineer to follow the latest trends and innovations to remain competitive and informed.

  11. Communication and Collaboration Skills: Robotics projects often involve collaboration with various stakeholders, including hardware engineers, mechanical engineers, software developers, and end-users. In particular, more and more robotics companies have adopted the Scrum methodology in recent years, which requires active communication and collaboration within the Scrum team. Effective communication and teamwork skills are essential for coordinating efforts, sharing ideas, resolving blockers, and ensuring a successful outcome.
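Point 5 above (kinematics and dynamics) can be made concrete with the classic unicycle model used for wheeled robots: integrating the robot’s pose from its commanded linear and angular velocity. A minimal, self-contained sketch:

```python
import math

def integrate_unicycle(x, y, theta, v, omega, dt):
    """Forward kinematics of a unicycle/differential-drive robot:
    advance the pose (x, y, theta) under a body twist (v, omega) for dt seconds."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta

# Drive straight along +x for one second at 1 m/s, then turn 90 degrees in place.
pose = (0.0, 0.0, 0.0)
pose = integrate_unicycle(*pose, v=1.0, omega=0.0, dt=1.0)
pose = integrate_unicycle(*pose, v=0.0, omega=math.pi / 2, dt=1.0)
print(pose)  # (1.0, 0.0, pi/2)
```

Real odometry uses smaller time steps and measured wheel velocities, but this Euler-integration core is the same.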

Last year, I used Netlink in my development work to collect events from kernel space. In this post, I present some basic exercises I did while learning to program with Netlink.

Introduction

Netlink is a Linux kernel interface used for inter-process communication (IPC) between both the kernel and userspace processes, and between different userspace processes, in a way similar to the Unix domain sockets. Similarly to the Unix domain sockets, and unlike INET sockets, Netlink communication cannot traverse host boundaries. Netlink provides a standard socket-based interface for userspace processes, and a kernel-side API for internal use by kernel modules. Originally, Netlink used the AF_NETLINK socket family. Netlink is designed to be a more flexible successor to ioctl; RFC 3549 describes the protocol in detail.

My practice

Kernel module

Below is the content of my source file “netlink_kernel.c”, which builds a kernel module. The macro MY_NETLINK 30 defines my custom netlink protocol; one can also choose existing protocols such as NETLINK_ROUTE or NETLINK_INET_DIAG — all available protocols can be seen on this page. The function netlink_kernel_create() creates a netlink socket for a userspace application to communicate with. More information about netlink programming can be found here.

netlink_kernel.c

#include <linux/module.h>
#include <linux/string.h>
#include <net/sock.h>
#include <linux/netlink.h>
#include <linux/skbuff.h>

/*
refer to https://elixir.bootlin.com/linux/v5.15.13/source/include/linux/netlink.h
*/

#define MY_NETLINK 30 // cannot be larger than 31, otherwise we shall get "insmod: ERROR: could not insert module netlink_kernel.ko: No child processes"

struct sock *nl_sk = NULL;

static void myNetLink_recv_msg(struct sk_buff *skb)
{
    struct nlmsghdr *nlhead;
    struct sk_buff *skb_out;
    int pid, res, msg_size;
    char *msg = "Hello msg from kernel";

    printk(KERN_INFO "Entering: %s\n", __FUNCTION__);

    msg_size = strlen(msg);

    nlhead = (struct nlmsghdr *)skb->data; // the nlmsghdr lies at the start of the skb's data

    printk(KERN_INFO "MyNetlink has received: %s\n", (char *)nlmsg_data(nlhead));

    pid = nlhead->nlmsg_pid; // sending process port ID; we will send the reply back to this userspace sender

    skb_out = nlmsg_new(msg_size, 0); // allocate a new netlink message: skb_out
    if (!skb_out)
    {
        printk(KERN_ERR "Failed to allocate new skb\n");
        return;
    }

    nlhead = nlmsg_put(skb_out, 0, 0, NLMSG_DONE, msg_size, 0); // add a new netlink message to the skb

    NETLINK_CB(skb_out).dst_group = 0; // unicast

    strncpy(nlmsg_data(nlhead), msg, msg_size);

    res = nlmsg_unicast(nl_sk, skb_out, pid);
    if (res < 0)
        printk(KERN_INFO "Error while sending back to user\n");
}

static int __init myNetLink_init(void)
{
    struct netlink_kernel_cfg cfg = {
        .input = myNetLink_recv_msg,
    };

    /* netlink_kernel_create() returns a pointer; check it against NULL */
    nl_sk = netlink_kernel_create(&init_net, MY_NETLINK, &cfg);
    printk("Entering: %s, protocol family = %d\n", __FUNCTION__, MY_NETLINK);
    if (!nl_sk)
    {
        printk(KERN_ALERT "Error creating socket.\n");
        return -10;
    }

    printk("MyNetLink Init OK!\n");
    return 0;
}

static void __exit myNetLink_exit(void)
{
    printk(KERN_INFO "exiting myNetLink module\n");
    netlink_kernel_release(nl_sk);
}

module_init(myNetLink_init);
module_exit(myNetLink_exit);
MODULE_LICENSE("GPL");

User space application

Below is the content of my file “netlink_client.c”. This program opens a netlink socket with the same protocol and lets the user send messages to, and receive messages from, the kernel module of the previous section.

netlink_client.c

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <string.h>
#include <asm/types.h>
#include <linux/netlink.h>
#include <linux/socket.h>
#include <errno.h>

#define NETLINK_USER 30  // same custom protocol as in my kernel module
#define MAX_PAYLOAD 1024 // maximum payload size

struct sockaddr_nl src_addr, dest_addr;
struct nlmsghdr *nlh = NULL;
struct nlmsghdr *nlh2 = NULL;
struct msghdr msg, resp; // the famous struct msghdr; it includes "struct iovec *msg_iov;"
struct iovec iov, iov2;
int sock_fd;

int main(int argc, char *argv[])
{
    // int socket(int domain, int type, int protocol);
    sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_USER);
    if (sock_fd < 0)
        return -1;

    memset(&src_addr, 0, sizeof(src_addr));
    src_addr.nl_family = AF_NETLINK;
    src_addr.nl_pid = getpid(); /* self pid */

    // int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
    if (bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr))) {
        perror("bind() error");
        close(sock_fd);
        return -1;
    }

    memset(&dest_addr, 0, sizeof(dest_addr));
    dest_addr.nl_family = AF_NETLINK;
    dest_addr.nl_pid = 0;    /* for the Linux kernel */
    dest_addr.nl_groups = 0; /* unicast */

    // nlh: holds the message to send
    nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
    memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
    nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
    nlh->nlmsg_pid = getpid(); // self pid
    nlh->nlmsg_flags = 0;

    // nlh2: holds the received message
    nlh2 = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
    memset(nlh2, 0, NLMSG_SPACE(MAX_PAYLOAD));
    nlh2->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
    nlh2->nlmsg_pid = getpid(); // self pid
    nlh2->nlmsg_flags = 0;

    strcpy(NLMSG_DATA(nlh), "Hello this is a msg from userspace");

    iov.iov_base = (void *)nlh; // iov -> nlh
    iov.iov_len = nlh->nlmsg_len;
    msg.msg_name = (void *)&dest_addr; // msg_name is the destination socket address
    msg.msg_namelen = sizeof(dest_addr);
    msg.msg_iov = &iov; // msg -> iov
    msg.msg_iovlen = 1;

    iov2.iov_base = (void *)nlh2; // iov2 -> nlh2
    iov2.iov_len = nlh2->nlmsg_len;
    resp.msg_name = (void *)&dest_addr;
    resp.msg_namelen = sizeof(dest_addr);
    resp.msg_iov = &iov2; // resp -> iov2
    resp.msg_iovlen = 1;

    printf("Sending message to kernel\n");
    int ret = sendmsg(sock_fd, &msg, 0);
    printf("send ret: %d\n", ret);

    printf("Waiting for message from kernel\n");
    recvmsg(sock_fd, &resp, 0); /* read the kernel's reply into nlh2 */
    printf("Received message payload: %s\n", (char *)NLMSG_DATA(nlh2));

    char usermsg[MAX_PAYLOAD];
    while (1) {
        printf("Input your msg for sending to kernel: ");
        scanf("%1023s", usermsg); // bounded read so usermsg cannot overflow

        strcpy(NLMSG_DATA(nlh), usermsg); // put the user's msg into nlh

        printf("Sending message \" %s \" to kernel\n", usermsg);
        ret = sendmsg(sock_fd, &msg, 0);
        printf("send ret: %d\n", ret);

        printf("Waiting for message from kernel\n");
        recvmsg(sock_fd, &resp, 0);
        printf("Received message payload: %s\n", (char *)NLMSG_DATA(nlh2));
    }
    close(sock_fd);
    return 0;
}
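As a side note, the same userspace client can be sketched in Python using only the standard library. The nlmsghdr layout below (length, type, flags, sequence, pid) follows linux/netlink.h. Treat this as an experimental sketch rather than a drop-in replacement: the protocol number must match the kernel module, and AF_NETLINK only exists on Linux.

```python
import os
import socket
import struct

NLMSG_HDRLEN = struct.calcsize("=LHHLL")  # nlmsghdr: len, type, flags, seq, pid

def build_nlmsg(payload, pid, msg_type=0, flags=0):
    """Pack a netlink message: a 16-byte nlmsghdr followed by the payload."""
    length = NLMSG_HDRLEN + len(payload)
    return struct.pack("=LHHLL", length, msg_type, flags, 0, pid) + payload

def talk_to_kernel():
    """Send one message to the kernel module above and print its reply.
    Requires netlink_kernel.ko to be loaded (protocol 30); Linux only."""
    MY_NETLINK = 30  # must match the protocol number in the kernel module
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, MY_NETLINK)
    sock.bind((os.getpid(), 0))  # netlink address: (pid, groups)
    sock.sendto(build_nlmsg(b"Hello from Python\x00", os.getpid()), (0, 0))  # pid 0 = kernel
    reply = sock.recv(65535)
    print(reply[NLMSG_HDRLEN:].split(b"\x00")[0].decode())
    sock.close()

print(len(build_nlmsg(b"ping\x00", 4242)))  # 16-byte header + 5-byte payload
```

This is handy for quick experiments before committing to the C client.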

Makefile

Create a Makefile with the following content, which will let us easily compile the source files.

Makefile

obj-m += netlink_kernel.o

# generate the current path
CURRENT_PATH := $(shell pwd)

# the current kernel version number
LINUX_KERNEL := $(shell uname -r)

# the absolute path of the kernel headers
LINUX_KERNEL_PATH := /usr/src/linux-headers-$(LINUX_KERNEL)

# compile the objects:
# an extension of the "make modules" cmd with the -C option and "M=dir" configuration;
# it switches the working directory to the path given after -C,
# searches for source files in the directory configured by "M=",
# and compiles them into .ko files

all:
	@echo $(LINUX_KERNEL_PATH)
	make -C $(LINUX_KERNEL_PATH) M=$(CURRENT_PATH) modules

client:
	gcc netlink_client.c -o netlink_client -g

#clean
clean:
	make -C $(LINUX_KERNEL_PATH) M=$(CURRENT_PATH) clean
	rm -f netlink_client

Test the communication

Make sure that files “netlink_kernel.c”, “netlink_client.c” and “Makefile” are in the same directory. In a Linux terminal window, cd into this directory and start the compilation:

make
make client

Normally, the kernel module file “netlink_kernel.ko” and the user application file “netlink_client” will be generated.

Load the generated kernel module “netlink_kernel.ko” into the Linux kernel:

sudo insmod netlink_kernel.ko

Execute the following cmd in a new terminal to monitor the kernel messages:

dmesg -Hw

Now start the user application to start the communication:

./netlink_client

Below is the screenshot of my test:

What is a core dump?

A core dump is a file that records a program’s memory at the moment of a crash. It captures the state of the working memory at a specific time, usually close to when the system crashed or the program terminated abnormally. A core dump file is therefore very important for developers, since we often need it to locate and fix crash issues.

How to activate “core dump”

  1. Firstly, one can type ulimit -c in a terminal console. If the output is 0, it means that core dumps are deactivated by default; in this case, Linux will not generate a core dump file when a program crashes.

  2. Then one can activate core dumps by setting a non-zero maximum size for the generated core dump file via ulimit -c [kbytes]. For example:

  • ulimit -c 100: set the maximum size of the core dump file to 100 KB
  • ulimit -c unlimited: no size limit on the core dump file
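The same limit can also be raised programmatically from inside a process using Python’s standard resource module (Linux/Unix only) — handy when you cannot touch the shell configuration. It affects only the calling process and the children it spawns afterwards:

```python
import resource

# Read the current (soft, hard) core-file size limits, in bytes.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("before:", soft, hard)

# Raise the soft limit up to whatever the hard limit allows --
# the programmatic equivalent of `ulimit -c`.
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
print("after:", resource.getrlimit(resource.RLIMIT_CORE))
```

A process may always lower its soft limit again later, but raising the hard limit requires privileges.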

Configuration of core dump

Once core dumps have been activated, the core dump file is generated in the current working directory. There is still one issue: the file is always named “core”, so it gets overwritten when multiple program crashes occur. What if we want to add the pid to the generated file name, or have the file generated in some other directory?

  1. Core dump filename with process id:

echo 1 > /proc/sys/kernel/core_uses_pid

Then the process id will be appended to the file name.

  2. One can set the output path and core dump file name pattern by updating the sysctl variable “kernel.core_pattern” in the file “/etc/sysctl.conf”.

Add the following line at the end of sysctl.conf:

kernel.core_pattern = /var/core/core_%e_%p

Save and exit.

Below are the available options in core_pattern setting:

  • %c maximum size of the generated file
  • %e executable filename
  • %g group ID of process
  • %h host name
  • %p process ID
  • %s signal which triggered this coredump
  • %t timestamp of this coredump (number of seconds since 1970-01-01)
  • %u user ID of the dumped process

Once a modification has been made, one can execute the following command to make it take effect:

sysctl -p /etc/sysctl.conf

Please note: once we have set /proc/sys/kernel/core_uses_pid to 1, the generated core dump file will still have the pid in its name even if we do not set %p in core_pattern.

Practice

In this section, I present a small core dump exercise. First, I performed the following configuration (root privileges might be required) for core dump file generation:

ulimit -c 1000000
ulimit -c
echo 1 > /proc/sys/kernel/core_uses_pid
echo "/tmp/corefile/core-%e-%p-%t" > /proc/sys/kernel/core_pattern


Then let’s write some code to provoke a segmentation fault. I put the code lines below into the file “testSegmtFault.c”.

#include <stdio.h>

int main()
{
    int IDs[10] = {0,1,2,3,4,5,6,7,8,9};
    int *id_ptr = 0;

    printf("0x%lx: %d \n", (unsigned long)IDs, *IDs);

    // correct logic
    for (id_ptr = IDs; id_ptr <= &IDs[9]; id_ptr++)
    {
        printf("0x%lx: %d \n", (unsigned long)id_ptr, *id_ptr);
    }

    // bad logic: id_ptr starts at NULL, so dereferencing it crashes
    for (id_ptr = 0; id_ptr <= &IDs[9]; id_ptr++)
    {
        printf("0x%lx: %d \n", (unsigned long)id_ptr, *id_ptr);
    }
}

Compile it then execute the generated executable file:

gcc -g testSegmtFault.c -o seg
./seg

The execution output is as follows, and there is a segmentation fault :D:

qiu@qiu-ThinkCentre-E73:~/workspace/c_basics/linux_flux$ ./seg
0x7ffe283297d0: 0
0x7ffe283297d0: 0
0x7ffe283297d4: 1
0x7ffe283297d8: 2
0x7ffe283297dc: 3
0x7ffe283297e0: 4
0x7ffe283297e4: 5
0x7ffe283297e8: 6
0x7ffe283297ec: 7
0x7ffe283297f0: 8
0x7ffe283297f4: 9
Segmentation fault


Now, as expected, there is a core dump file named “core-seg-20017-1639863524” generated in /tmp/corefile/.

Now let’s use gdb to debug the crash with the help of the executable and the core dump file:

cd /tmp/corefile
gdb ~/workspace/c_basics/linux_flux/seg core-seg-20017-1639863524

We can see that gdb immediately shows the defective code line number:

Last year, I managed the development of a WebApp for industrial supply chain sourcing at our start-up. It’s a typical B/S-architecture project whose server side was entirely developed by myself (Ubuntu running on AliCloud + Django + Neo4j + MongoDB + Docker + Tomcat + nginx + other dependencies).


Below are some notes about a simple and powerful tool called httpie that I used for testing my APIs. The notes were originally taken in Chinese; here they are given in English.

Installing with pip from a specified package source:

pip install --upgrade httpie -i http://pypi.douban.com/simple --trusted-host pypi.douban.com

Testing a GET request:

http -v GET http://127.0.0.1:8000/suppliers_server/get_user?username=qiu

Testing a POST request with form data:

http --form POST http://127.0.0.1:8000/suppliers_server/query_login username="qiu" password="123456"

Testing a GET request containing a cookie:

http -v GET http://127.0.0.1:8000/corporadb/login  Cookie:sessionid=vc1d2m5kb1zul63l9g24qq5w58vfj35g

Including a token in the header (testing a GET request containing a token):

http -v GET http://127.0.0.1:8000/suppliers_server/query_chain?product=cd Cookie:sessionid=y4nokqeymk54tdapicjjkmbve1qfuheo Authorization:eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6InFpdSJ9.qLDaly37vl77SxyHNPEqq6_HKzbSinmPcG9GvGQ-JdQ
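For completeness, the GET-with-cookie case can also be reproduced with nothing but Python’s standard library (the URL and session id are the same placeholder values as above):

```python
from urllib.request import Request, urlopen

req = Request("http://127.0.0.1:8000/corporadb/login")
req.add_header("Cookie", "sessionid=vc1d2m5kb1zul63l9g24qq5w58vfj35g")

# Inspect what would be sent; calling urlopen(req) would perform the request.
print(req.get_method(), req.full_url)
print(req.get_header("Cookie"))
```

httpie remains much more convenient for interactive use, but this works anywhere Python is installed.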

Following my previous post, I would like to illustrate the solution of the following problem (figure below) using the binary search algorithm.
As shown in the figure above, we are given a sorted array v = [2, 2, 2, 3, 3, 4, 5, 6, 6, 7] with repeated numbers, and we want to find the index of the first occurrence of the target number 3. It’s easy for a human to see that the answer should be 3 (we suppose the index of the first element is 0). How about computers?

Algorithm illustration

As usual, we define three pointers: l, r, and mid. They point respectively to the lower (left) bound index, the upper (right) bound index, and the middle index of the search range. We calculate mid as mid = l + (r-l)/2.

Initially, we search the entire array. So l = 0, r = 9 and mid = 4. Here is the first step:
Though mid already points to an element whose value is equal to 3, we cannot stop searching, since we are not sure whether it is the first occurrence in the array. So we continue searching by changing r to the current value of mid, narrowing the search range down to the left half of the array: l = 0, r = 4, and mid is updated to 2:


Now the element pointed to by mid is 2, which is smaller than our target number 3. So we narrow the search range again, to the right half of the current range. Now l = 3, r = 4, and mid is updated to 3:


Now the element pointed to by mid is 3, which is equal to the target number 3. We move r to the current mid, so l = r = mid = 3. Solution found!


Overview

I put all the steps of this example into one figure to give an overview of the solution:


C++ Code

#include <vector>
#include <iostream>

using namespace std;

/*
 * First-occurrence search
 */
int foSearch(vector<int>& array, int target) {
    if (array.empty())
        return -1;

    int l = 0, r = array.size() - 1;
    while (l < r)
    {
        int mid = l + (r - l) / 2;

        if (array[mid] > target)
        {
            r = mid - 1;
        }
        else if (array[mid] == target)
        {
            r = mid; // an earlier occurrence may still exist, so keep mid in range
        }
        else
        {
            l = mid + 1;
        }
    }
    cout << "l=" << l << ", r=" << r << endl;

    int res = -1;
    if (array[l] == target)
    {
        res = l;
    }
    return res;
}


int main()
{
    vector<int> vec {1, 1, 4, 4, 9, 9, 12, 12, 12, 32, 43, 43, 43, 55, 55, 63, 77, 77, 98, 98, 98, 100};
    int ans = foSearch(vec, 43);
    cout << "ans is : " << ans << endl;
    cout << "value is : " << vec[ans] << endl;

    return 0;
}
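As a cross-check, Python’s standard bisect module implements exactly this “first occurrence” logic: bisect_left returns the leftmost position where the target could be inserted, which equals the index of the first occurrence whenever the target is present:

```python
from bisect import bisect_left

def first_occurrence(array, target):
    """Leftmost index of target in a sorted array, or -1 if absent."""
    i = bisect_left(array, target)
    if i < len(array) and array[i] == target:
        return i
    return -1

vec = [1, 1, 4, 4, 9, 9, 12, 12, 12, 32, 43, 43, 43, 55, 55, 63, 77, 77, 98, 98, 98, 100]
print(first_occurrence(vec, 43))  # 10
print(first_occurrence(vec, 44))  # -1
```

Comparing its results against the hand-rolled C++ version is a quick way to validate the boundary handling.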

“A picture is worth a thousand words”

In order to explain to a friend how the binary search algorithm works, I drew several pictures showing the solution of a simple problem with this algorithm. I would like to share them in this post.

As shown in the figure below, we are given a sorted array v = [1, 2, 3, 4, 5, 7, 9, 10, 12, 17], and we want to find the index of the target number 9. One can easily see that the answer should be 6 (suppose the index of the first element is 0).


Binary search algorithm illustration

In the binary search algorithm, we define three pointers to elements: l, r, and mid. They point respectively to the lower (left) bound index, the upper (right) bound index, and the middle index of the search range. We calculate mid as mid = l + (r-l)/2.

Initially, we search the entire array. So l = 0, r = 9 and mid = 4. Here is the first step:


The element pointed to by mid is 5, which is smaller than our target number 9. So we narrow the search range down to the right half of the array: l = mid+1 = 5, r = 9, and mid is updated to 7:


Now the element pointed to by mid is 10, which is larger than our target number 9. So we narrow the search range again, to its left half.


We repeat the same steps until the element pointed to by mid is equal to the target number 9. See, we have found that the index of the number 9 in this array is 6.

Binary search algorithm code

Below is the C++ code for this algorithm:

/*
 * Binary search
 */
int binSearch(vector<int>& array, int target) {
    int l = 0, r = array.size() - 1;
    while (l <= r)
    {
        int mid = l + (r - l) / 2;
        if (array[mid] == target)
        {
            return mid;
        }
        if (array[mid] > target)
        {
            r = mid - 1;
        }
        else
        {
            l = mid + 1;
        }
    }

    return -1;
}

Overview

I put all the steps of this example into one figure to give an overview of the binary search algorithm:


A similar problem

Here is a similar problem, which can also be solved by varying the binary search algorithm a little: in a sorted array with repeated elements, we would like to find the index of the first occurrence of the target number. Think about the solution; I will present it using figures in my next post.


Recently, in an interview, I was asked about my experience implementing trained TensorFlow models on the Android platform. I had tried one Android project cloned from GitHub which embedded a tflite model, but I had not yet implemented my own model in an Android application. So I did this exercise today, and I successfully made my CNN model work on my Redmi Note 8 Pro.


CNN model

Here is the code for training a CNN model on the MNIST dataset. The model is then converted to a tflite model, which will be embedded in an Android application for recognizing handwritten digits.


import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = "2"
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics, models


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(x_train[0, :, :])
## x_train.shape => (60000, 28, 28), y_train.shape => (60000,)
## x_test.shape => (10000, 28, 28), y_test.shape => (10000,)
x_train = tf.expand_dims(x_train, -1)
x_test = tf.expand_dims(x_test, -1)

y_train = tf.squeeze(y_train)
y_test = tf.squeeze(y_test)

print("Dataset info: ", x_train.shape, y_train.shape, x_test.shape, y_test.shape)

batch_size = 128

train_db = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_db = train_db.shuffle(10000).batch(batch_size)

test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
test_db = test_db.batch(batch_size)

train_iter = iter(train_db)
sample = next(train_iter)
print(sample[0].shape, sample[1].shape)

## build a standard cnn model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

model.summary()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

train_history = model.fit(train_db, epochs=10, validation_data=test_db)

## once the model has been trained, convert it to a tflite model
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open('qiu_mnist_model.tflite', 'wb') as f:
    f.write(tflite_model)

Implementation in Android app

I referred to this post for obtaining the original Android project. I imported the Kotlin version into my Android Studio. However, there were some bugs initially when I loaded my model into it.

My own model is located in the assets directory:

The most important thing for this work is the following Gradle setting:

After about 15 minutes of debugging and code modifications, I successfully made my model work.

Check out the video (there is still an accuracy issue):

I will upload the Android project source code to my GitHub repo once I finish cleaning up the code and improving the performance.

References
  1. https://www.tensorflow.org/lite/performance/post_training_quantization

  2. https://margaretmz.medium.com/e2e-tfkeras-tflite-android-273acde6588

In my startup, we have worked on some AIoT (AI + IoT) projects and built some MVPs as demos for our clients. In one project, we used the MQTT protocol for collecting sensor data and transporting it to other smart devices. So in this post, I would like to present a simple demo of building MQTT communication via a broker on the AWS cloud.


MQTT (Message Queuing Telemetry Transport) is an open OASIS and ISO standard (ISO/IEC 20922) lightweight, publish-subscribe network protocol that transports messages between devices. The protocol usually runs over TCP/IP; however, any network protocol that provides ordered, lossless, bi-directional connections can support MQTT. It is designed for connections with remote locations where a “small code footprint” is required or the network bandwidth is limited.

Cloud side

On the AWS server, I use Docker to run an emqx service, which plays the role of the broker:

sudo docker run --rm -ti --name emqx -p 18083:18083 -p 1883:1883 -e EMQX_ADMIN_PASSWORD="myPasswd" emqx/emqx:latest

Output of my putty terminal:

When this service is running, one can access its administration page by visiting http://3.138.200.179:18083/#/:


Subscriber

I have a subcriber.py file which defines an MQTT client that connects to the cloud broker on port 1883 and subscribes to the topic 'qiu_data'.

import paho.mqtt.client as mqtt

## callback upon connection
def on_connect(client, userdata, flags, rc):
    print("Connected, result code: " + str(rc))

## callback upon arrival of a msg
def on_message(client, userdata, msg):
    print(msg.topic + " " + str(msg.payload))

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
# client.on_disconnect = on_disconnect

## I have not yet bound a domain name to my AWS ECS
client.connect('3.138.200.179', 1883, 600)  # keepalive interval: 600 s
client.subscribe('qiu_data', qos=0)
client.loop_forever()  # keep running

Publisher

I have a publisher.py file which defines another MQTT client that connects to the cloud broker on port 1883 and publishes some data to the 'qiu_data' topic.

import paho.mqtt.client as mqtt
import json
import time

def on_connect(client, userdata, flags, rc):
    print("Connected with result code: " + str(rc))

def on_message(client, userdata, msg):
    print(msg.topic + " " + str(msg.payload))

data = {
    "type": "IoT",
    "timestamp": time.time(),
    "msgId": "8ed7a307-0e82-9738-xxxx",
    "data": {
        "temp": 23.5,
        "speed": 46.8
    }
}

param = json.dumps(data)

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect('3.138.200.179', 1883, 600)  # keepalive: 600 s
client.publish('qiu_data', payload='hello world', qos=0)
time.sleep(1)
client.publish("qiu_data", payload=param, qos=0)

Execution

In one terminal I type python subcriber.py, and in a second terminal I type python publisher.py. See, the published messages have been received in the subscriber’s terminal:
