Find out the common parts of all the strings
By : Johann
Date : March 29 2020, 07:55 AM
Build an array of all possible substrings, sort them and then look for blocks of consecutive equal strings. The implementation below looks for substrings of a certain length and imposes a minimal number of matches. It is not clear what you want exactly, but you need some constraints. It is easy to look for the longest common substrings, but if you just want common substrings, what does that mean? Are 20 occurrences of a 4-character string better than 10 occurrences of a 5-character string? code :
/*
 * Return an array of all substrings of the given length
 * that occur at least mincount times across all the strings
 * in the input array strings.
 */
function substrings(strings, length, mincount) {
    var subs = [];
    var res = [];
    var i, j, s;

    // Collect every substring of the requested length.
    for (i = 0; i < strings.length; i++) {
        s = strings[i];
        for (j = 0; j + length <= s.length; j++) {
            subs.push(s.slice(j, j + length));
        }
    }
    if (subs.length === 0) return res;

    // After sorting, equal substrings sit next to each other.
    subs.sort();

    var last = subs[0];
    var count = 1;
    for (i = 1; i <= subs.length; i++) {
        // i === subs.length flushes the final block.
        if (i < subs.length && subs[i] === last) {
            count++;
        } else {
            if (count >= mincount) res.push(last);
            last = subs[i];
            count = 1;
        }
    }
    return res;
}
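If what you want is the longest substrings that still reach the minimum count, one option is to try lengths from longest to shortest and stop at the first length that yields a hit. The sketch below is only an illustration of that idea: the names countSubstrings and longestFrequentSubstrings are made up for this example, and it counts with a Map instead of sorting. As in the function above, repeated occurrences inside the same string all count.

```javascript
// Count every substring of the given length across all input strings.
function countSubstrings(strings, length) {
    const counts = new Map();
    for (const s of strings) {
        for (let j = 0; j + length <= s.length; j++) {
            const sub = s.slice(j, j + length);
            counts.set(sub, (counts.get(sub) || 0) + 1);
        }
    }
    return counts;
}

// Try lengths from longest to shortest and return the substrings of the
// first length at which at least one substring reaches mincount.
function longestFrequentSubstrings(strings, mincount) {
    const maxLen = Math.max(0, ...strings.map(s => s.length));
    for (let length = maxLen; length >= 1; length--) {
        const hits = [];
        for (const [sub, n] of countSubstrings(strings, length)) {
            if (n >= mincount) hits.push(sub);
        }
        if (hits.length > 0) return hits.sort();
    }
    return [];
}

// Example call with made-up input:
console.log(longestFrequentSubstrings(["foobar", "barfoo", "xfoox"], 3));
```

This rescans the input once per candidate length, so it is only reasonable for short strings.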
|
Find relevant parts in a collection of strings
By : Mohammed Xenon
Date : November 09 2020, 08:00 AM
Here is a little hack. I am not saying it is optimal, but it could be worth following this path if no other option is available: reverse every string, take the common prefix of the reversed strings, and reverse the result. code :
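// StringUtils is assumed to be org.apache.commons.lang3.StringUtils (Apache Commons Lang);
// paths is the String[] of input strings.
String[] reversedPaths = new String[paths.length];
for (int i = 0; i < paths.length; i++) {
    reversedPaths[i] = StringUtils.reverse(paths[i]);
}
// The common prefix of the reversed strings is the reversed common suffix.
String suffix = StringUtils.reverse(StringUtils.getCommonPrefix(reversedPaths));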
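For comparison, the same reverse-then-prefix trick can be sketched without a library. This is a minimal JavaScript illustration, not a port of the snippet above; the helper names commonPrefix, reverse and commonSuffix are invented for the example.

```javascript
// Longest prefix shared by every string in the array.
function commonPrefix(strings) {
    if (strings.length === 0) return "";
    let prefix = strings[0];
    for (const s of strings.slice(1)) {
        let k = 0;
        while (k < prefix.length && k < s.length && prefix[k] === s[k]) k++;
        prefix = prefix.slice(0, k);
    }
    return prefix;
}

// Reverse a string code point by code point.
function reverse(s) {
    return Array.from(s).reverse().join("");
}

// The common suffix is the reversed common prefix of the reversed strings.
function commonSuffix(strings) {
    return reverse(commonPrefix(strings.map(reverse)));
}

// Example call with made-up paths:
console.log(commonSuffix(["a/b/file.txt", "c/file.txt", "profile.txt"]));
```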
|
How can I find common parts of 3 or more strings?
By : Ruey Luu Richie
Date : March 29 2020, 07:55 AM
I think you need a suffix tree (see Wikipedia). Build the suffix tree for each document. If you don't care about individual characters, feel free to use words instead of characters. Once you have this, you need to find the longest path from the root that is found in all (or most) of the individual suffix trees. So just pick one tree, start at its root and do a DFS, going down a link only if you find it in all (or sufficiently many) of the other trees. This will iterate through all substrings that are common to all the documents.
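A rough sketch of that traversal follows. To keep it short it builds a plain, uncompressed suffix trie per document rather than a real suffix tree, so it needs quadratic memory and is only suitable for small inputs. It also requires the link to exist in all tries, not just "sufficiently many". The function names are invented for this example.

```javascript
// Build an uncompressed suffix trie: every suffix of text is inserted
// character by character. Each node is a Map from character to child node.
function buildSuffixTrie(text) {
    const root = new Map();
    for (let start = 0; start < text.length; start++) {
        let node = root;
        for (let k = start; k < text.length; k++) {
            const ch = text[k];
            if (!node.has(ch)) node.set(ch, new Map());
            node = node.get(ch);
        }
    }
    return root;
}

// DFS over the first trie, following a link only if every other trie has
// the same link at the corresponding node. Every path visited is a substring
// common to all texts; the longest one seen is returned (first found on ties).
function longestCommonSubstring(texts) {
    if (texts.length === 0) return "";
    const tries = texts.map(buildSuffixTrie);
    let best = "";

    function dfs(nodes, path) {
        if (path.length > best.length) best = path;
        for (const ch of nodes[0].keys()) {
            const children = nodes.map(n => n.get(ch));
            if (children.every(c => c !== undefined)) {
                dfs(children, path + ch);
            }
        }
    }

    dfs(tries, "");
    return best;
}

// Example call with made-up input:
console.log(longestCommonSubstring(["xabcdy", "zzabcd", "abcdq"]));
```

To work on words instead of characters, the same code could be run on arrays of words, with the path kept as an array.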
|
Find non-intersecting parts of a strings vector
By : Adrian
Date : March 29 2020, 07:55 AM
Whoops! Forgot to mention that you had a typo in the last element of vec3.
|
Find matches between parts of strings in Excel
By : Joris R.
Date : October 04 2020, 09:00 AM
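I have a table with URLs, like , You could try the formula below. It compares the third "/"-separated part of A1 with the third "/"-separated part of B1, which is the host name when the URL starts with a scheme such as http://. code :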
=TRIM(MID(SUBSTITUTE(A1,"/",REPT(" ",LEN(A1))),2*LEN(A1)+1,LEN(A1)))=TRIM(MID(SUBSTITUTE(B1,"/",REPT(" ",LEN(B1))),2*LEN(B1)+1,LEN(B1)))
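To make it easier to see what the formula computes, here is roughly the same comparison as a JavaScript sketch. The function names are invented for the example. It lowercases both sides because Excel's = comparison on text ignores case; other details of Excel's TRIM (such as collapsing inner spaces) are not reproduced.

```javascript
// Return the third "/"-separated part of the string, or "" if there is none.
function thirdSegment(url) {
    const parts = url.split("/");
    return parts.length > 2 ? parts[2].trim() : "";
}

// Case-insensitive comparison of the third segment of two strings.
function sameThirdSegment(a, b) {
    return thirdSegment(a).toLowerCase() === thirdSegment(b).toLowerCase();
}

// Example call with made-up URLs:
console.log(sameThirdSegment("http://example.com/a", "https://Example.com/b/c"));
```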
|