Updated 1 July 2023
In this blog, we will learn how to implement a text recognizer using the camera and Firebase ML Kit.
With its updated release, Firebase has put powerful image-processing capabilities into developers' hands in an easy and resource-friendly way.
You can now use image processing and machine learning techniques in your application very easily with the Firebase ML Kit.
ML Kit is a mobile SDK that brings Google’s machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Whether you’re new or experienced in machine learning, you can implement the functionality you need in just a few lines of code. There’s no need to have deep knowledge of neural networks or model optimization to get started. On the other hand, if you are an experienced ML developer, ML Kit provides convenient APIs that help you use your custom TensorFlow Lite models in your mobile apps.
— Firebase Developer Guide
The Firebase ML Kit currently offers five base APIs: text recognition, face detection, barcode scanning, image labelling, and landmark recognition.
We will currently focus on how we can recognize the text using the camera of an Android device.
Just a preview of what you will be able to do after reading this blog:
With these, you are good to go.
In your project-level build.gradle:

```groovy
dependencies {
    // ...
    classpath 'com.android.tools.build:gradle:3.1.2'
    classpath 'com.google.gms:google-services:3.2.0'
}
```
In your app-level build.gradle:

```groovy
dependencies {
    // ...
    implementation 'com.google.firebase:firebase-ml-vision:16.0.0'
}
```
In your AndroidManifest.xml:

```xml
<!-- ... -->
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
<!-- ... -->
<application ...>
    ...
    <meta-data
        android:name="com.google.firebase.ml.vision.DEPENDENCIES"
        android:value="text" />
    <!-- To use multiple models: android:value="label,text" -->
</application>
```
After these initial steps, you are good to start writing code for the text recognizer.
I have named my Activity as LauncherActivity.
XML File (activity_launcher.xml):

```xml
<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:id="@+id/fireTopLayout"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:background="#000"
    android:keepScreenOn="true"
    android:orientation="vertical">

    <com.webkul.mobikul.mlkitdemo.customviews.CameraSourcePreview
        android:id="@+id/Preview"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:layout_alignParentLeft="true"
        android:layout_alignParentStart="true"
        android:layout_alignParentTop="true">

        <com.webkul.mobikul.mlkitdemo.customviews.GraphicOverlay
            android:id="@+id/Overlay"
            android:layout_width="match_parent"
            android:layout_height="match_parent"
            android:layout_alignParentBottom="true"
            android:layout_alignParentLeft="true"
            android:layout_alignParentStart="true"
            android:layout_alignParentTop="true" />

    </com.webkul.mobikul.mlkitdemo.customviews.CameraSourcePreview>

    <RelativeLayout
        android:id="@+id/control"
        android:layout_width="match_parent"
        android:layout_height="60dp"
        android:layout_alignParentBottom="true"
        android:layout_alignParentLeft="true"
        android:layout_alignParentStart="true"
        android:layout_toEndOf="@id/Preview"
        android:layout_toRightOf="@id/Preview">

        <TextView
            android:id="@+id/resultsMessageTv"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_centerHorizontal="true"
            android:layout_centerVertical="true"
            android:drawableTint="@android:color/white"
            android:padding="6dp"
            android:text="@string/results_found"
            android:textColor="@android:color/white" />

    </RelativeLayout>

    <LinearLayout
        android:id="@+id/resultsContainer"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_above="@id/control"
        android:gravity="bottom">

        <android.support.v7.widget.RecyclerView
            android:id="@+id/results_spinner"
            android:layout_width="match_parent"
            android:layout_height="match_parent"
            tools:listitem="@layout/camera_simple_spinner_item" />

    </LinearLayout>

</RelativeLayout>
```
Java Class File (LauncherActivity):
```java
public class LauncherActivity extends AppCompatActivity {

    private static final String TAG = "LauncherActivity";
    private static final int PERMISSION_REQUESTS = 1; // to handle the runtime permissions

    private CameraSourcePreview preview;      // to handle the camera
    private GraphicOverlay graphicOverlay;    // to draw over the camera screen
    private CameraSource cameraSource = null; // to handle the camera
    private RecyclerView resultSpinner;       // to display the results received from Firebase ML Kit
    private List<String> displayList;         // to manage the adapter of the results received
    private ResultAdapter displayAdapter;     // adapter bound with the result RecyclerView --> contains a simple TextView with a background
    private TextView resultNumberTv;          // to display the number of results
    private LinearLayout resultContainer;     // just another layout to maintain the symmetry

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_launcher);

        // getting views from the xml
        resultNumberTv = (TextView) findViewById(R.id.resultsMessageTv);
        resultContainer = (LinearLayout) findViewById(R.id.resultsContainer);
        resultSpinner = (RecyclerView) findViewById(R.id.results_spinner);
        preview = (CameraSourcePreview) findViewById(R.id.Preview);
        graphicOverlay = (GraphicOverlay) findViewById(R.id.Overlay);

        // initializing views
        displayList = new ArrayList<>();
        resultSpinner.setLayoutManager(new LinearLayoutManager(LauncherActivity.this,
                LinearLayoutManager.VERTICAL, false));
        displayAdapter = new ResultAdapter(LauncherActivity.this, displayList);
        resultSpinner.setAdapter(displayAdapter);
        resultContainer.getLayoutParams().height =
                (int) (Resources.getSystem().getDisplayMetrics().heightPixels * 0.65);
        resultNumberTv.setText(getString(R.string.x_results_found, displayList.size()));

        if (preview == null) {
            Log.d(TAG, "Preview is null");
        }
        if (graphicOverlay == null) {
            Log.d(TAG, "graphicOverlay is null");
        }

        if (allPermissionsGranted()) {
            createCameraSource();
        } else {
            getRuntimePermissions();
        }
    }

    @Override
    protected void onResume() {
        super.onResume();
        startCameraSource();
    }

    @Override
    protected void onPause() {
        super.onPause();
        preview.stop();
    }

    @Override
    public void onDestroy() {
        super.onDestroy();
        if (cameraSource != null) {
            cameraSource.release();
        }
    }

    // Actual code to start the camera
    private void startCameraSource() {
        if (cameraSource != null) {
            try {
                if (preview == null) {
                    Log.d(TAG, "startCameraSource resume: Preview is null");
                }
                if (graphicOverlay == null) {
                    Log.d(TAG, "startCameraSource resume: graphicOverlay is null");
                }
                preview.start(cameraSource, graphicOverlay);
            } catch (IOException e) {
                Log.d(TAG, "startCameraSource: Unable to start camera source. " + e.getMessage());
                cameraSource.release();
                cameraSource = null;
            }
        }
    }

    // Function to check if all permissions have been granted by the user
    private boolean allPermissionsGranted() {
        for (String permission : getRequiredPermissions()) {
            if (!isPermissionGranted(this, permission)) {
                return false;
            }
        }
        return true;
    }

    // List of permissions required by the application to run
    private String[] getRequiredPermissions() {
        return new String[]{android.Manifest.permission.CAMERA,
                android.Manifest.permission.INTERNET,
                Manifest.permission.WRITE_EXTERNAL_STORAGE};
    }

    // Checking a runtime permission value
    private static boolean isPermissionGranted(Context context, String permission) {
        if (ContextCompat.checkSelfPermission(context, permission)
                == PackageManager.PERMISSION_GRANTED) {
            Log.d(TAG, "isPermissionGranted: Permission granted --> " + permission);
            return true;
        }
        Log.d(TAG, "isPermissionGranted: Permission NOT granted --> " + permission);
        return false;
    }

    // Getting runtime permissions
    private void getRuntimePermissions() {
        List<String> allNeededPermissions = new ArrayList<>();
        for (String permission : getRequiredPermissions()) {
            if (!isPermissionGranted(this, permission)) {
                allNeededPermissions.add(permission);
            }
        }
        if (!allNeededPermissions.isEmpty()) {
            ActivityCompat.requestPermissions(this,
                    allNeededPermissions.toArray(new String[0]), PERMISSION_REQUESTS);
        }
    }

    // Function to create a camera source and retain it
    private void createCameraSource() {
        // If there's no existing cameraSource, create one.
        if (cameraSource == null) {
            cameraSource = new CameraSource(this, graphicOverlay);
        }
        try {
            cameraSource.setMachineLearningFrameProcessor(new TextRecognitionProcessor(this));
        } catch (Exception e) {
            Log.d(TAG, "createCameraSource: cannot create camera source: " + e.getCause());
            e.printStackTrace();
        }
    }

    // Updating and displaying the results received from the Firebase text recognition API
    public void updateSpinnerFromTextResults(FirebaseVisionText textResults) {
        List<FirebaseVisionText.Block> blocks = textResults.getBlocks();
        for (FirebaseVisionText.Block eachBlock : blocks) {
            for (FirebaseVisionText.Line eachLine : eachBlock.getLines()) {
                for (FirebaseVisionText.Element eachElement : eachLine.getElements()) {
                    if (!displayList.contains(eachElement.getText()) && displayList.size() <= 9) {
                        displayList.add(eachElement.getText());
                    }
                }
            }
        }
        resultNumberTv.setText(getString(R.string.x_results_found, displayList.size()));
        displayAdapter.notifyDataSetChanged();
    }
}
```
TextRecognitionProcessor:
```java
public class TextRecognitionProcessor extends VisionProcessorBase<FirebaseVisionText> {

    private static final String TAG = "TextRecognitionProcessr";

    private final FirebaseVisionTextDetector detector;
    private final LauncherActivity activityInstance;

    public TextRecognitionProcessor(LauncherActivity activity) {
        detector = FirebaseVision.getInstance().getVisionTextDetector();
        activityInstance = activity;
    }

    @Override
    public void stop() {
        try {
            detector.close();
        } catch (IOException e) {
            Log.e(TAG, "Exception thrown while trying to close Text Detector: " + e);
        }
    }

    @Override
    protected Task<FirebaseVisionText> detectInImage(FirebaseVisionImage image) {
        return detector.detectInImage(image);
    }

    @Override
    protected void onSuccess(
            @NonNull FirebaseVisionText results,
            @NonNull FrameMetadata frameMetadata,
            @NonNull GraphicOverlay graphicOverlay) {
        graphicOverlay.clear();
        activityInstance.updateSpinnerFromTextResults(results);
    }

    @Override
    protected void onFailure(@NonNull Exception e) {
        Log.w(TAG, "Text detection failed." + e);
    }
}
```
ResultAdapter:
```java
public class ResultAdapter extends RecyclerView.Adapter<ResultAdapter.ViewHolder> {

    private Context context;
    private List<String> labelList;

    public ResultAdapter(Context context, List<String> labelList) {
        this.context = context;
        this.labelList = labelList;
    }

    @NonNull
    @Override
    public ViewHolder onCreateViewHolder(@NonNull ViewGroup parent, int viewType) {
        View view = LayoutInflater.from(context)
                .inflate(R.layout.camera_result_item, parent, false);
        return new ViewHolder(view);
    }

    @Override
    public void onBindViewHolder(@NonNull ViewHolder holder, final int position) {
        ((TextView) holder.itemView.findViewById(R.id.label_tv)).setText(labelList.get(position));
    }

    @Override
    public int getItemCount() {
        return labelList.size();
    }

    public class ViewHolder extends RecyclerView.ViewHolder {
        public ViewHolder(View itemView) {
            super(itemView);
        }
    }
}
```
camera_result_item (XML File):
```xml
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="wrap_content"
    android:orientation="vertical">

    <TextView
        android:id="@+id/label_tv"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:minHeight="48dp"
        tools:text="label" />

    <View
        android:layout_width="match_parent"
        android:layout_height="1dp"
        android:background="#a8ffffff" />

</LinearLayout>
```
Sources :
https://firebase.google.com/docs/ml-kit/android/recognize-text
https://github.com/firebase/quickstart-android/tree/master/mlkit/
Keep coding and Keep Sharing 🙂
29 comments
The code in the blog is a complete reference.
Please let me know which part of the blog you think is incomplete or are unable to follow.
I will try to improve that part of the blog so that it helps all the readers.
The missing camera_simple_spinner_item layout is only referenced in the tools namespace, so you can either remove that line or replace the layout with camera_result_item, whose code is shared in the blog itself.
Regarding the black screen on the camera: the issue is that the CameraSourcePreview class is not initializing correctly in your code. Please debug that code, or share your code with us and we will help you with it.
Hope this helps you.
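To make the replacement concrete, here is how the RecyclerView element from the activity layout would look with the preview item swapped to the layout shared in this post. Note that `tools:listitem` only affects the Android Studio layout preview, not runtime behaviour:

```xml
<android.support.v7.widget.RecyclerView
    android:id="@+id/results_spinner"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:listitem="@layout/camera_result_item" />
```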
Thanks for your comment, but currently I cannot put this code in a GitHub repository.
You can ask about the issue you are facing here, and I will try to reply as soon as possible.
For example, I have received the results in the method "onSuccess" of TextRecognitionProcessor.java, which is a child of VisionProcessorBase.
I personally modified the constructor of TextRecognitionProcessor.java and passed the instance of LauncherActivity in the constructor.
You can also implement the same using an interface.
If you still have any doubts, do let me know; I will help you as much as I can.
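For readers who prefer the interface approach mentioned above, a minimal sketch could look like this. The names `OnTextResultsListener`, `onTextResults`, and `TextProcessorSketch` are hypothetical (not part of the blog's code), and a plain `String` stands in for the `FirebaseVisionText` result so the sketch stays self-contained:

```java
// Hypothetical callback interface: an alternative to passing
// LauncherActivity directly into the processor's constructor.
interface OnTextResultsListener {
    void onTextResults(String recognizedText);
}

// A minimal stand-in for the processor, showing how the listener is wired.
class TextProcessorSketch {
    private final OnTextResultsListener listener;

    TextProcessorSketch(OnTextResultsListener listener) {
        this.listener = listener;
    }

    // In the real app this would be called from onSuccess(...) with the
    // FirebaseVisionText result; here we pass a plain String for brevity.
    void deliver(String text) {
        listener.onTextResults(text);
    }
}

public class CallbackDemo {
    public static void main(String[] args) {
        StringBuilder received = new StringBuilder();
        // The activity (or any class) supplies the listener implementation.
        TextProcessorSketch processor = new TextProcessorSketch(received::append);
        processor.deliver("hello");
        System.out.println(received); // prints "hello"
    }
}
```

With this shape, the processor no longer depends on LauncherActivity specifically; any class implementing the listener can receive the results.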
Can you post the code on how to do this (including the callback to the Launcher)?
“I personally modified the constructor of TextRecognitionProcessor.java and passed the instance of Launcher activity in the constructor.”
thanks heaps
Kingsley
The constructor of TextRecognitionProcessor.java will look like this:

```java
public TextRecognitionProcessor(LauncherActivity activity) {
    detector = FirebaseVision.getInstance().getVisionTextDetector();
    activityInstance = activity;
}
```

With this, your onSuccess method will be something like:

```java
@Override
protected void onSuccess(
        @NonNull FirebaseVisionText results,
        @NonNull FrameMetadata frameMetadata,
        @NonNull GraphicOverlay graphicOverlay) {
    graphicOverlay.clear();
    activityInstance.updateSpinnerFromTextResults(results);
}
```
I am also adding this file to the article, so that you can refer to it completely.
If you still have any confusion, please feel free to ask.
Appreciate the quick update too.
But the problem you are describing is most probably due to improper use of the CameraSourcePreview object.
Please cross-check that, before you access the camera, you have initialized the camera source and then started it in onResume of your activity.
Also check first that you have added the camera permission to your manifest file and are requesting it at runtime as required by the Android API level.
Can you share the code with me? I need it on an urgent basis, and the MLKIT library is outdated.
Yes, the ML Kit library dependency mentioned in the blog is outdated, but this is quite common for Firebase dependencies. You can still use this version or update it; it is completely up to you and your use case.
Secondly, the blog in here is a complete reference to tell you how you can use this functionality.
I do not have any code other than the files shared in the blog.
Still, if you want, you can share your code and I will try to help you out.
What exactly is the error you are facing?
Please do share some insight about the error so that I can look into it & help you.
ResultAdapter is just a simple RecyclerView adapter class in which I have inflated a text view and a background (just to make the views presentable).
I have also mentioned the same in the comment along with the declaration of the ResultAdapter object.
Still, if you face any issue, then do let me know.
Can you share the ResultAdapter code here?
I have added the same in the last section of the Blog.
Please do have a look.