
Extract and concatenate to single string 2 or 3 or 4.. text elements in a pdf, by regex and proximity

£20-250 GBP

Completed
Posted about 4 years ago

Paid on delivery
I need .NET code to highlight (and extract and concatenate) 2, 3 or 4 separate text elements in a PDF, based on regular expressions and their proximity to each other (see attached image).

1.) I am using the DevExpress PdfDocumentProcessor to obtain the document text and coordinates using the [login to view URL] property.
2.) Then, use the standard Regex class to get all substrings in the given string (the text returned with the [login to view URL] property) that match your regular expression.

Example text in PDF:
240
TT
12345

Example regex (should find the elements above individually):
1st line: 3 numeric chars: ^\d{3}$
2nd line: 2 alpha chars: ^[A-Z]{2}$
3rd line: 5 numeric chars: ^\d{5}$

Criteria:
All 3 text elements share the same text height.
All 3 text elements have the same (or close, +- 10%) X coordinate value.
All 3 text elements are within a Y coordinate distance of char height * 4 (+- 10%) of each other.

Required concatenated string: 240TT12345

I'm guessing the workflow would be something along the lines of:
Open the PDF.
Extract all text elements.
Find text matching the first line of regex.
Is there a text element of the same character height with the same X coordinate value below this element (within the height of the text element above, +- 10%)?
Is there another text element of the same character height with the same X coordinate value below this element (within the height of the text element above, +- 10%)?
If there is, extract all text elements, concatenated to a string, e.g. 240TT12345.
Highlight the elements in the PDF.

I would class myself as an intermediate coder, but I'm really struggling here because the number of lines to search using regex can be 2, sometimes 3, maybe 4. Perhaps a LINQ query to find all by regex and proximity would work, but I am happy to see all suggestions.
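The matching criteria above can be illustrated with a small, library-free sketch. It is written in Python only to keep it short and testable; the posting asks for .NET, and neither the DevExpress extraction step nor the highlighting step is shown. Everything named here (TextElement, close, find_stacks, concatenate, max_gap_factor) is hypothetical. The sketch assumes each extracted element carries its text, left X coordinate, Y coordinate and character height, that Y increases down the page, and that "char height * 4 +- 10%" means a vertical gap of at most height * 4 * 1.1. Because the list of patterns can have any length, the same function covers the 2, 3 and 4 line cases.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    text: str
    x: float       # left edge of the element
    y: float       # vertical position; assumed to increase down the page
    height: float  # character height


def close(a, b, tol=0.10):
    """True when a and b differ by at most tol of the larger magnitude."""
    return abs(a - b) <= tol * max(abs(a), abs(b))


def find_stacks(elements, patterns, max_gap_factor=4.0, tol=0.10):
    """Return every chain of elements that matches the patterns in order,
    where each element sits below the previous one, shares its height and
    X coordinate (within tol), and lies within the allowed vertical gap."""
    compiled = [re.compile(p) for p in patterns]
    results = []

    def extend(chain):
        if len(chain) == len(compiled):
            results.append(chain)
            return
        prev = chain[-1]
        rx = compiled[len(chain)]
        for e in elements:
            gap = e.y - prev.y
            if (rx.fullmatch(e.text)
                    and close(e.height, prev.height, tol)
                    and close(e.x, prev.x, tol)
                    and 0 < gap <= prev.height * max_gap_factor * (1 + tol)):
                extend(chain + [e])

    for e in elements:
        if compiled[0].fullmatch(e.text):
            extend([e])
    return results


def concatenate(chain):
    """Join the texts of one matched chain into a single string."""
    return "".join(e.text for e in chain)


if __name__ == "__main__":
    sample = [
        TextElement("240", 100, 50, 10),
        TextElement("TT", 101, 62, 10),
        TextElement("12345", 99, 74, 10),
        TextElement("999", 300, 50, 10),   # right pattern, nothing below it
    ]
    patterns = [r"^\d{3}$", r"^[A-Z]{2}$", r"^\d{5}$"]
    for chain in find_stacks(sample, patterns):
        print(concatenate(chain))  # prints 240TT12345
```

Each returned chain keeps the original elements, so their coordinates remain available for whatever highlighting call the PDF library offers. In .NET the same search could be expressed with Regex.IsMatch and nested LINQ Where clauses over the extracted elements.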
Project ID: 24739628

About the project

9 proposals
Remote project
Active 4 yrs ago

Awarded to:
I am fully confident that I can help you sort out your regular expressions problem, and I am ready to start immediately.

QUESTIONS/COMMENTS
1) It would be very helpful if you could upload a small sample of the [login to view URL] property that you are going to parse. I ask because I think it will contain the contents of multiple text elements, which would let me see clearly what you mean by proximity.
2) What exactly do you mean by a text element? In the attached image I see 3 "figures" joined by dashed arrows: the 1st contains 240-TE-24381, the 2nd contains 240-TT-24381, and the 3rd contains 240-TI-24381. Does a "figure" (e.g. 240-TE-24381) correspond to one text element, or do its individual parts (240, TE, 24381) each constitute a text element?
3) I have not followed how the X or Y offsets relate to the regexes. Please explain.

EXPERIENCE
Although new to Freelancer.com, I have extensive experience with regular expressions and I am familiar with the regex “flavour” implemented in .NET. For example, I know that named capturing groups in .NET use the (?<id>\w+) or (?'id'\w+) format, while the Python syntax for named capturing groups is (?P<id>\w+). In addition to “regular” concepts such as character classes, anchors and word boundaries, I am also very much at home with concepts such as atomic grouping, lookahead and lookbehind.

Thanks,
Tushar
£69 GBP in 4 days
5.0 (9 reviews)
4.5

9 freelancers are bidding an average of £141 GBP for this job

Hello, I can help you with your project, "Extract and concatenate to single string 2 or 3 or 4.. text elements in a pdf, by regex and proximity". I have gone through your job posting and am very interested in working with you. I am an expert in this field and have already completed several projects like this; for evidence, please see my profile: https://www.freelancer.com/u/schoudhary1553 I have an excellent command of English. I am a hard worker, productive and worthy of your attention, and I hope I would be the right candidate for this post. Awaiting an affirmative response from you. Kind regards, Sandeep
£220 GBP in 4 days
5.0 (35 reviews)
5.9

I am a PDF expert. I can write code to extract text from a raw PDF without libraries, though it works for simple PDFs only. I hope yours is one of them; please send it so I can check.
£200 GBP in 3 days
5.0 (3 reviews)
3.8

Hi Claire J.! I'm a Graphic Designer with over 6 years' experience, based in Vancouver, Canada. I've previously worked on PDF and VB.NET for other employers. Please see my portfolio at www.visak2691.com. I look forward to working on this project with you. Thank you, Vishakh
£142 GBP in 7 days
4.5 (6 reviews)
3.7

VB.NET expert with PDF processing experience. Interested in doing your regex matching project.
£145 GBP in 7 days
4.9 (7 reviews)
3.1

Hello, I have read all your requirements for 'Extract and concatenate to single string 2 or 3 or 4.. text elements in a pdf, by regex and proximity' and I fully understand them. I have done this kind of project before, and I am confident that I can finish this one. Please get in touch so that we can discuss the details via chat :) Skills: PDF, VB.NET
£150 GBP in 1 day
5.0 (2 reviews)
2.5

10+ years' experience in C#. Experienced in processing inconsistent Excel and PDF files. Can complete in a day.
£111 GBP in 1 day
5.0 (1 review)
1.9

I need this project and will do my best work for you. Any employer may contact me. I do professional data entry work.
£135 GBP in 7 days
0.0 (0 reviews)
0.0

About the client

Bagshot, United Kingdom
5.0
4
Payment method verified
Member since Sep 8, 2017

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)